April 9th, 2025

[Gemini][LINEBot] Building an Agent LINE Bot with Google ADK

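Introduction

Just a few days ago I integrated the OpenAI Agents SDK into a simple LINE Bot sample, and then in the early hours of 2025-04-10 Google announced the release of ADK (Agent Development Kit).

This post shows how to build the simplest possible LINE Bot with Google's Agent Development Kit (ADK), as a starter project for MCP and other features later on. (I did not expect to be switching to Google ADK this soon XD)

Sample code: https://github.com/kkdai/linebot-adk

Update 05/25: the SDK has moved to an asynchronous API, so session creation now has to be awaited: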

session = await session_service.create_session(app_name=app_name, 
                                    user_id=user_id, 
                                    session_id=session_id)

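A quick introduction to Google ADK

Repo: https://github.com/google/adk-python

Google's Agent Development Kit (ADK) is an open-source framework meant to simplify the development of intelligent multi-agent systems. The key points:

ADK is an open-source framework designed for multi-agent systems, covering everything from building to deployment.
It supports modular, scalable application development and lets multiple specialized agents collaborate.

Features:

Built-in streaming: bidirectional audio and video streaming for natural human-computer interaction.
Flexible orchestration: workflow agents and LLM-driven dynamic routing.
Integrated developer experience: a powerful CLI and a visual Web UI for developing, testing, and debugging.
Built-in evaluation: systematic evaluation of agent performance.
Easy deployment: containerized deployment is supported.

Visual testing with the Web UI

(Refer: https://google.github.io/adk-docs/get-started/quickstart/#run-your-agent)

You can run quick tests locally through the Web UI and then deploy quickly to Google Cloud. These features will be covered in later posts.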

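Things to watch for when integrating the LINE Bot SDK

Next, let's go over what to watch out for when adding the LINE Bot SDK.

Sample code: https://github.com/kkdai/linebot-adk

Agent initialization flow

For now, the agent is initialized when the service starts.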

# Initialize ADK client
root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash-exp",
    description=(
        "Agent to answer questions about the time and weather in a city."
    ),
    instruction=(
        "I can answer your questions about the time and weather in a city."
    ),
    tools=[get_weather, get_current_time],
)
print(f"Agent '{root_agent.name}' created.")
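The agent above lists two tools, get_weather and get_current_time, that the post does not show. In ADK a tool can be a plain Python function whose docstring and type hints describe it to the model. The following is a minimal sketch with hard-coded data; the lookup tables, the fixed UTC offsets (no daylight saving), and the status/report return shape are assumptions for illustration, not code taken from the sample repository.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical canned data; a real bot would call a weather or time API instead.
WEATHER_REPORTS = {"taipei": "sunny with a high of 28 degrees Celsius"}
UTC_OFFSET_HOURS = {"taipei": 8, "tokyo": 9, "london": 0}  # ignores daylight saving


def get_weather(city: str) -> dict:
    """Returns the current weather report for the given city."""
    report = WEATHER_REPORTS.get(city.strip().lower())
    if report is None:
        return {"status": "error",
                "error_message": f"No weather information for '{city}'."}
    return {"status": "success", "report": f"The weather in {city} is {report}."}


def get_current_time(city: str) -> dict:
    """Returns the current local time for the given city."""
    offset = UTC_OFFSET_HOURS.get(city.strip().lower())
    if offset is None:
        return {"status": "error",
                "error_message": f"No timezone information for '{city}'."}
    # Build a fixed-offset timezone and format the current time in it.
    now = datetime.now(timezone(timedelta(hours=offset)))
    return {"status": "success",
            "report": f"The current time in {city} is {now:%Y-%m-%d %H:%M:%S} (UTC{offset:+d})."}
```

Returning an explicit error status instead of raising gives the model something it can relay to the user when a city is unknown.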

Once the agent is created, prepare a Runner to handle the communication with the agent.

# Key Concept: Runner orchestrates the agent execution loop.
runner = Runner(
    agent=root_agent,  # The agent we want to run
    app_name=APP_NAME,   # Associates runs with our app
    session_service=session_service  # Uses our session manager
)

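After that, the runner can be called asynchronously to get the agent's result (shown later).

Remembering conversations per user in memory

ADK offers several Memory Services:

InMemoryMemoryService
- Stores data in the service's own memory, which works as a basic option. On Cloud Run, however, everything is lost when the service restarts.
VertexAiRagMemoryService
- Uses Vertex AI's RAG service, which may incur extra storage costs.

Next, here is how to keep each user's conversation in memory. The code below goes through the session_service object and creates one session per user.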

async def get_or_create_session(user_id):
    if user_id not in active_sessions:
        # Create a new session for this user
        session_id = f"session_{user_id}"
        session = await session_service.create_session(
            app_name=APP_NAME,
            user_id=user_id,
            session_id=session_id
        )
        active_sessions[user_id] = session_id
        print(
            f"New session created: App='{APP_NAME}', User='{user_id}', Session='{session.id}'")
    else:
        # Use existing session
        session_id = active_sessions[user_id]
        print(
            f"Using existing session: App='{APP_NAME}', User='{user_id}', Session='{session_id}'")

    return session_id

First, get_or_create_session() creates or looks up a user's session ID from the user_id, so that ADK can continue the conversation with the correct session ID.
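One thing to keep in mind is that two webhooks from the same user can arrive at nearly the same time, in which case both may find the user missing and both call create_session. Below is a minimal sketch of one way to guard against that with an asyncio.Lock. FakeSessionService, SessionRegistry, and the APP_NAME value are stand-ins written for this example so that it runs without ADK installed; in the real bot the session_service object from the post would take the fake's place.

```python
import asyncio
from types import SimpleNamespace

APP_NAME = "linebot_adk_app"  # hypothetical app name


class FakeSessionService:
    """Stand-in for the ADK session service, for illustration only."""

    def __init__(self):
        self.created = 0

    async def create_session(self, app_name, user_id, session_id):
        await asyncio.sleep(0)  # yield control, as a real async call would
        self.created += 1
        return SimpleNamespace(id=session_id, app_name=app_name, user_id=user_id)


class SessionRegistry:
    """Maps each user_id to one session_id, creating sessions on demand."""

    def __init__(self, service):
        self._service = service
        self._sessions = {}
        self._lock = None  # created lazily inside the running event loop

    async def get_or_create(self, user_id: str) -> str:
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            if user_id not in self._sessions:
                session_id = f"session_{user_id}"
                await self._service.create_session(
                    app_name=APP_NAME, user_id=user_id, session_id=session_id)
                self._sessions[user_id] = session_id
            return self._sessions[user_id]


async def demo():
    """Fire three concurrent lookups for the same user."""
    service = FakeSessionService()
    registry = SessionRegistry(service)
    ids = await asyncio.gather(*(registry.get_or_create("U123") for _ in range(3)))
    return ids, service.created
```

A single lock serializes session creation for all users, which is simple but coarse; a per-user lock would be the next refinement if creation ever became slow.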

async def call_agent_async(query: str, user_id: str) -> str:
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")

    # Get or create a session for this user
    session_id = await get_or_create_session(user_id)

    # Prepare the user's message in ADK format
    content = types.Content(role='user', parts=[types.Part(text=query)])

    final_response_text = "Agent did not produce a final response."  # Default

    try:
        # Key Concept: run_async executes the agent logic and yields Events.
        # We iterate through events to find the final answer.
        async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
            # Key Concept: is_final_response() marks the concluding message for the turn.
            if event.is_final_response():
                if event.content and event.content.parts:
                    # Assuming text response in the first part
                    final_response_text = event.content.parts[0].text
                elif event.actions and event.actions.escalate:  # Handle potential errors/escalations
                    final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
                # Add more checks here if needed (e.g., specific error codes)
                break  # Stop processing events once the final response is found
    except ValueError as e:
        # Handle errors, especially session not found
        print(f"Error processing request: {str(e)}")
        # Recreate session if it was lost
        if "Session not found" in str(e):
            active_sessions.pop(user_id, None)  # Remove the invalid session
            session_id = await get_or_create_session(user_id)  # Create a new one
            # Try again with the new session
            try:
                async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
                    # Same event handling code as above
                    if event.is_final_response():
                        if event.content and event.content.parts:
                            final_response_text = event.content.parts[0].text
                        elif event.actions and event.actions.escalate:
                            final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
                        break
            except Exception as e2:
                final_response_text = f"Sorry, I encountered an error: {str(e2)}"
        else:
            final_response_text = f"Sorry, I encountered an error: {str(e)}"

    print(f"<<< Agent Response: {final_response_text}")
    return final_response_text

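With the code above, each time a user's input (query, user_id) comes in, the user_id is used to create (or look up) that user's own conversation record (session).

The ADK query then runs against that session, mainly through async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):

This gives every user their own memory, without calling any additional memory-related functions.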
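The event-handling loop in call_agent_async appears twice, once for the first attempt and once for the retry. A sketch of factoring it into one helper is shown below. FakeEvent, fake_run, and fake_run_without_final are stand-ins invented for this example so that it runs without ADK; they only mimic the attributes the post's code reads (is_final_response() and content.parts[0].text).

```python
import asyncio
from types import SimpleNamespace

DEFAULT_REPLY = "Agent did not produce a final response."


class FakeEvent:
    """Mimics the few event attributes used by the post's code."""

    def __init__(self, text, final):
        part = SimpleNamespace(text=text)
        self.content = SimpleNamespace(parts=[part])
        self._final = final

    def is_final_response(self):
        return self._final


async def final_text(events) -> str:
    """Consume an async stream of events and return the final reply text."""
    async for event in events:
        if event.is_final_response():
            if event.content and event.content.parts:
                return event.content.parts[0].text
            break  # final event without text: fall through to the default
    return DEFAULT_REPLY


async def fake_run(texts):
    """Yield one event per text, marking only the last one as final."""
    for i, text in enumerate(texts):
        yield FakeEvent(text, final=(i == len(texts) - 1))


async def fake_run_without_final():
    """Yield a stream that never produces a final response."""
    yield FakeEvent("thinking", final=False)
```

With a helper like this, both the first attempt and the retry could reduce to a single call that passes runner.run_async(...) as the events argument.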

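Summary and future work

Having come this far, you may have the same question I did: "What is the actual difference between the OpenAI Agents SDK and Google ADK?"

[Image: table comparing the OpenAI Agents SDK and Google ADK, compiled by Grok3]

As the table summarizes, I don't find ADK any simpler to use than the OpenAI Agents SDK. But it ships with a useful Web UI and many prepackaged tools, which removes a lot of worry from future development. Next I will integrate MCP Servers, voice, and multimodal applications into ADK and share them here, so stay tuned.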

March 31st, 2025

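[Gemini][LINEBot] Building an OpenAI Agents LINE Bot that uses the Google Gemini model

Introduction

On 03/11 OpenAI released the new OpenAI Agents SDK package (OpenAI-Agents-Python). Besides letting multiple agents interact with one another, it also announced support for MCP Servers.

This post shows how to build the simplest possible LINE Bot with the OpenAI Agents SDK, as a starter project for MCP and other features later on.

Sample code: https://github.com/kkdai/linebot-openai-agent

A quick introduction to the OpenAI Agents SDK

OpenAI introduced a set of new tools and APIs, including the Responses API and the Agents SDK (OpenAI-Agents-Python), aimed at making it easier for developers and enterprises to build agents. The Responses API combines the simplicity of the Chat Completions API with the tool-use capabilities of the Assistants API, and has built-in support for web search, file search, and computer use. The Agents SDK offers improved observability and safety checks, simplifies the orchestration of multi-agent workflows, and supports handing control off between agents, with the aim of improving productivity across industries.

The package also supports MCP Servers; the details are left for a later post.

Using Google Gemini through a Custom Provider

First, the OpenAI Agents SDK needs to work with another vendor's model, which is done here with a Custom Provider.

We use the custom_example_provider.py example as a reference; the finished integration is in the sample code: https://github.com/kkdai/linebot-openai-agent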

BASE_URL = os.getenv("EXAMPLE_BASE_URL") or ""
API_KEY = os.getenv("EXAMPLE_API_KEY") or ""
MODEL_NAME = os.getenv("EXAMPLE_MODEL_NAME") or ""

# Initialize OpenAI client
client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY)
set_tracing_disabled(disabled=True)

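Three environment variables are required:

BASE_URL: the API URL of the Custom Provider. For Google Gemini, set it to https://generativelanguage.googleapis.com/v1beta/ (Google's OpenAI-compatibility documentation lists the endpoint with an extra openai/ path segment, https://generativelanguage.googleapis.com/v1beta/openai/, so try that form if requests fail).
API_KEY: your own Google Gemini API key.
MODEL_NAME: a Gemini model name; gemini-1.5-flash keeps costs down.

With these set, AsyncOpenAI() can call the Google Gemini service.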
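Because the snippet above falls back to empty strings when a variable is unset, a missing value only surfaces later as a failed API call. A small sketch of failing fast at startup follows. ProviderConfig and load_provider_config are names made up for this example, while the EXAMPLE_* variable names come from the snippet above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    base_url: str
    api_key: str
    model_name: str


def load_provider_config(env=os.environ) -> ProviderConfig:
    """Read the three settings and raise a clear error if any is missing."""
    names = ("EXAMPLE_BASE_URL", "EXAMPLE_API_KEY", "EXAMPLE_MODEL_NAME")
    values = {name: env.get(name, "").strip() for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return ProviderConfig(
        base_url=values["EXAMPLE_BASE_URL"],
        api_key=values["EXAMPLE_API_KEY"],
        model_name=values["EXAMPLE_MODEL_NAME"],
    )
```

Accepting the environment as a parameter keeps the function easy to test with a plain dict.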

Adding simple Tools

This sample follows the original way of writing Tools and adds two of them.

@function_tool
def get_weather(city: str):
    """Get weather information for a city"""
    print(f"[debug] getting weather for {city}")
    return f"The weather in {city} is sunny."


@function_tool
def translate_to_chinese(text: str):
    """Translate text to Traditional Chinese"""
    print(f"[debug] translating: {text}")
    return f"Translating to Chinese: {text}"

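Two tools are available here: translate_to_chinese and get_weather. Since this is only a sample, get_weather simply replies that the weather is sunny.

Integrating the LINE Bot SDK with the OpenAI Agents SDK

The code is explained in two parts. The first is the LINE Bot webhook handling, which has little else going on: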

for event in events:
        if not isinstance(event, MessageEvent):
            continue

        if event.message.type == "text":
            # Process text message
            msg = event.message.text
            user_id = event.source.user_id
            print(f"Received message: {msg} from user: {user_id}")

            # Use the user's prompt directly with the agent
            response = await generate_text_with_agent(msg)
            reply_msg = TextSendMessage(text=response)
            await line_bot_api.reply_message(
                event.reply_token,
                reply_msg
            )
        elif event.message.type == "image":
            return 'OK'
        else:
            continue

It mainly calls generate_text_with_agent and awaits the result.

async def generate_text_with_agent(prompt):
    """
    Generate a text completion using OpenAI Agent.
    """
    # Create agent with appropriate instructions
    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant that responds in Traditional Chinese (zh-TW). Provide informative and helpful responses.",
        model=OpenAIChatCompletionsModel(
            model=MODEL_NAME, openai_client=client),
        tools=[get_weather, translate_to_chinese],
    )

    try:
        result = await Runner.run(agent, prompt)
        return result.final_output
    except Exception as e:
        print(f"Error with OpenAI Agent: {e}")
        return f"抱歉，處理您的請求時出現錯誤: {str(e)}"

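When using OpenAIChatCompletionsModel, pass model=MODEL_NAME and openai_client=client so that the Custom Model Provider is used. Only then does the call actually reach the Google Gemini model.

To use Tools, list every supported tool in tools=[get_weather, translate_to_chinese] so that the agent can invoke them.

Results and usage

After deployment, using it is simple: just ask it a question. If the question is about translation, it uses translate_to_chinese directly; if you ask for a city's weather, it first confirms the city name with you and then always replies that the weather is sunny.

Summary and future work

This post provided sample code and showed how to connect the OpenAI Agents SDK (OpenAI-Agents-Python) to the LINE Bot SDK. In the next few posts we will start hooking up some useful MCP Servers to give the LINE Bot more complete functionality.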

March 27th, 2025

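[Google Cloud] Getting around YouTube blocking traffic from GCP networks (solved with a proxy)

Introduction

An earlier post, "[Google Cloud] How to get YouTube information with LangChain on GCP Cloud Run", covered using Secret Manager and the GCP-related LangChain YouTube package to fetch data. Recently, however, YouTube changed its access rules again and the original approach stopped working. This post records the main error message and how to fix it.

The main problem

One day YouTube captions could no longer be fetched, and the logs showed the following.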

During handling of the above exception, another exception occurred:
youtube_transcript_api._errors.RequestBlocked:
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=ViA4-YWx8Y4! 
This is most likely caused by:
YouTube is blocking requests from your IP. This usually is due to one of the following reasons:
- You have done too many requests and your IP has been blocked by YouTube
- You are doing requests from an IP belonging to a cloud provider (like AWS, Google Cloud Platform, Azure, etc.). Unfortunately, most IPs from cloud providers are blocked by YouTube.

There are two things you can do to work around this:
1. Use proxies to hide your IP address, as explained in the "Working around IP bans" section of the README (https://github.com/jdepoix/youtube-transcript-api?tab=readme-ov-file#working-around-ip-bans-requestblocked-or-ipblocked-exception).
2. (NOT RECOMMENDED) If you authenticate your requests using cookies, you will be able to continue doing requests for a while. However, YouTube will eventually permanently ban the account that you have used to authenticate with! So only do this if you don't mind your account being banned!
If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtub

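A quick rundown of the problem:

Within the LangChain stack, GoogleApiYoutubeLoader also relies on https://github.com/jdepoix/youtube-transcript-api/ under the hood.
Because the service is deployed on GCP, YouTube blocks requests coming from that cloud's IP ranges.
The package recommends using a proxy service such as Webshare to fetch the data.

Webshare

Webshare is a third-party paid proxy service. Routing your web requests through it helps in cases such as:

Accessing (crawling) Google services (Maps, YouTube, Google Search) from GCP.
Crawling sites behind CDN services that tend to block CSP (Cloud Service Provider) IPs, such as Cloudflare.

It also has a free proxy tier:

Five proxies
1 GB of usage per month

For YouTube captions, that amount of traffic is not a problem.

It is also easy to use. Here is the code for fetching a YouTube transcript through Webshare: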

from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.proxies import WebshareProxyConfig
import os


def get_transcripts(video_id, languages):
    # Get proxy credentials from environment variables
    proxy_username = os.environ.get("PROXY_USERNAME")
    proxy_password = os.environ.get("PROXY_PASSWORD")

    ytt_api = YouTubeTranscriptApi(
        proxy_config=WebshareProxyConfig(
            proxy_username=proxy_username,
            proxy_password=proxy_password,
        )
    )
    transcript_list = ytt_api.fetch(video_id, languages=languages)
    transcript_texts = [snippet["text"] for snippet in transcript_list.to_raw_data()]
    return " ".join(transcript_texts)


# Example usage (only runs when script is executed directly)
if __name__ == "__main__":
    video_id = "YOUR_VIDEO_ID"
    languages = ["en", "de"]
    transcript_text = get_transcripts(video_id, languages)
    print(transcript_text)

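It is a bit slower than a direct connection, but it really does fetch the data.

Something to watch out for (extra cost)

Webshare has a free tier and is easy to use, but if you fetch too often, Webshare may block you and ask you to pay. Be careful here.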

youtube_transcript_api._errors.RequestBlocked:

Could not retrieve a transcript for the video https://www.youtube.com/watch?v=ViA4-YWx8Y4! This is most likely caused by:

YouTube is blocking your requests, despite you using Webshare proxies. Please make sure that you have purchased "Residential" proxies and NOT "Proxy Server" or "Static Residential", as those won't work as reliably! The free tier also uses "Proxy Server" and will NOT work!

The only reliable option is using "Residential" proxies (not "Static Residential"), as this allows you to rotate through a pool of over 30M IPs, which means you will always find an IP that hasn't been blocked by YouTube yet!

You can support the development of this open source project by making your Webshare purchases through this affiliate link: https://www.webshare.io/?referral_code=1yl49cgzfedr
Thank you for your support! <3
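Since every proxied request counts against the Webshare quota, one simple mitigation is to avoid fetching the same transcript twice. The sketch below wraps any fetch function in a small in-memory cache. TranscriptCache and the fake fetcher are hypothetical names for this example; in the bot, the fetch argument would be the get_transcripts function shown earlier. Like any in-process cache, it is lost when a Cloud Run instance restarts.

```python
from typing import Callable, Dict, Sequence, Tuple


class TranscriptCache:
    """Caches transcripts by (video_id, languages) to save proxy traffic."""

    def __init__(self, fetch: Callable[[str, Sequence[str]], str]):
        self._fetch = fetch
        self._cache: Dict[Tuple[str, Tuple[str, ...]], str] = {}

    def get(self, video_id: str, languages: Sequence[str]) -> str:
        key = (video_id, tuple(languages))
        if key not in self._cache:
            # Only successful fetches are stored; an exception propagates
            # to the caller and leaves the cache untouched.
            self._cache[key] = self._fetch(video_id, languages)
        return self._cache[key]


# Usage with a fake fetcher that records how often it is called.
calls = []


def fake_fetch(video_id, languages):
    calls.append(video_id)
    return f"transcript of {video_id}"


cache = TranscriptCache(fake_fetch)
first = cache.get("abc123", ["en", "de"])
second = cache.get("abc123", ["en", "de"])  # served from the cache
```

The second lookup never reaches the fetcher, so repeated questions about the same video cost only one proxied request.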

March 22nd, 2025

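[Gemini][MCP] Calling MCP from Cline with Gemini

Background

MCP is a very hot topic lately, but when people mention MCP they inevitably think of Anthropic's Claude or other vendors' models. This post covers some of the basic principles behind MCP and shows how to call MCP with Google Gemini. I hope it helps organize things for you.

What is MCP (Model Context Protocol)?

Anthropic's documentation describes it this way: MCP is an open protocol that standardizes how applications provide context to large language models (LLMs). Think of MCP as a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.

Here is also an architecture diagram from YouTube, https://www.youtube.com/watch?v=McNRkd5CxFY&t=17s, to make this easier to understand.

(Image source: 技术爬爬虾 TechShrimp, "MCP是啥？技术原理是什么？一个视频搞懂MCP的一切。Windows系统配置MCP，Cursor,Cline 使用MCP")

As the diagram shows, the AI clients mentioned here (the commonly used ChatGPT, Claude, Gemini, and so on) can all "operate" on these services directly through the MCP architecture.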

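The MCP architecture diagram

(Architecture diagram: MCP Core architecture)

This diagram clearly lays out MCP's client-server architecture. To reiterate:

MCP Host: the application that uses MCP services, such as Cline, Windsurf, or Claude Desktop.
MCP Server: this has been introduced in many places, so I won't go into detail; sample code will follow later. It exposes external functionality through a common protocol, so that any MCP Host can use it more easily.
MCP Client: once a Host decides to use a given MCP Server, the Host holds a corresponding client for it, which here exists in the form of a prompt. This is described in detail below.

How MCP works in detail

The full details are in the video referenced below; I have adapted the content to Google Gemini. Besides replacing the DeepSeek model used in the video, this also makes the setup a better fit for security-conscious use.

Reference video:

技术爬爬虾 TechShrimp, "MCP是怎么对接大模型的？抓取AI提示词，拆解MCP的底层原理"

Setting up an AI API Gateway with Cloudflare

(To inspect the details, you need OpenRouter or an OpenAI-compatible provider.)

Create a Cloudflare account.
Create AI -> API Gateway.
Choose OpenRouter as the option, and remember to sign up for an OpenRouter account.

To see the details of MCP communication you have to use OpenAI or OpenRouter. The video has a complete tutorial; here I will just paste the relevant details.

[Image: request captured through the Cloudflare gateway, showing the MCP-related prompt]

The above is the traffic captured through Cloudflare, which lets us analyze how MCP communicates:

You will find that every installed MCP Server and its capabilities are fed to the model as a prompt.
What each MCP Server can do is written in the prompt (the image shows the GitHub Repository MCP Server).

This information matters and comes up again in the next section, on MCP versus Function Calling.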

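The difference between MCP and Function Calling

Some LLMs decide how to call tools through Function Calling, which requires extra support from the model. The common LLM providers (OpenAI, Google, Anthropic) all support Function Calling, but having to write a dedicated Function Calling app for each of these tools every time is still a headache.

That is why MCP became popular: it turns what used to be a per-app Function Calling integration into a shared mechanism. Every vendor can write its own MCP Server, which also makes its service usable by more people.

And, very importantly:

MCP lets models that cannot use Function Calling still use MCP Servers to build Function Calling-like applications
MCP lets models that cannot use Function Calling still use MCP Servers to build Function Calling-like applications
MCP lets models that cannot use Function Calling still use MCP Servers to build Function Calling-like applications

The model used in the 技术爬爬虾 TechShrimp video "MCP是怎么对接大模型的？抓取AI提示词，拆解MCP的底层原理", DeepSeekChat, did not actually support Function Calling yet, but it could still interact through MCP Servers.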
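To make the idea concrete, here is a toy sketch of how a host could offer tools to a model purely through text: tool descriptions are rendered into the system prompt, and the host parses a tool request back out of the model's plain-text reply. The TOOL_CALL line format, the JSON layout, and the get_time tool are all invented for this example; this is not Cline's actual prompt format, nor the MCP wire protocol.

```python
import json
import re

# Hypothetical tool catalogue, as a host might collect it from its MCP servers.
TOOLS = {
    "get_time": {"description": "Return the current time in a city.",
                 "arguments": {"city": "string"}},
}


def build_system_prompt(tools: dict) -> str:
    """Describe every tool in plain text so any chat model can read it."""
    lines = [
        "You can use the following tools.",
        'To call one, reply with a line of the form TOOL_CALL: {"name": ..., "arguments": {...}}',
    ]
    for name, spec in tools.items():
        lines.append(
            f"- {name}: {spec['description']} Arguments: {json.dumps(spec['arguments'])}")
    return "\n".join(lines)


def parse_tool_call(reply: str):
    """Return (name, arguments) if the reply requests a known tool, else None."""
    match = re.search(r"^TOOL_CALL:\s*(\{.*\})\s*$", reply, re.MULTILINE)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    # Reject anything that is not a request for a tool we actually offer.
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        return None
    return call["name"], call.get("arguments", {})
```

Because everything travels as ordinary text, the model needs no native function-calling support; the host does the parsing and dispatches to the matching MCP Server.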

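Using Cline as the MCP host

(Refer: Cline Plugin)

Cline is a VS Code plugin. There are already quite a few similar AI IDE plugins, but Cline has built-in support for talking to MCP Servers and a built-in recommended list that lets developers connect to MCP Servers quickly, so I strongly recommend Cline as your first tool for getting to know MCP.

Future work

This post gave a quick overview and used Cline with the Gemini 1.5-flash model to call an MCP Server in a time-lookup demo. Upcoming posts will show how to write an MCP Server and share applications you may find useful.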


March 21st, 2025

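[Gemini] Having Gemini extract keywords from your question, search with Google Custom Search, and reply with a summary

Background

We have long known that function calling gives an LLM the ability to use tools: web search, database queries, or even more specialized tasks.

Web search usually relies on paid external services such as SerpAPI. If your service already runs on Google Cloud Platform, have you wondered whether there is a service you could use there? This post shows how to search Google through the Google Custom Search API and reply with a summary of the results.

How to quickly get a Google Search results page

When calling from a local machine, you can request https://www.google.com/search?q=YOUR_KEYWORD directly to get the search results, and you could also fetch those results with a crawler. But...

If your service runs on GCP, it cannot crawl Google pages directly
If your service runs on GCP, it cannot crawl Google pages directly
If your service runs on GCP, it cannot crawl Google pages directly

At this point you need to consider whether there is another way.

Google Custom Search JSON API

(Documentation link)

The Google Custom Search JSON API provides 100 search queries per day for free. If you need more, enable billing in the API console. Additional requests cost $5 per 1,000 queries, up to 10,000 queries per day.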
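As a worked example of that pricing, the sketch below estimates one day's cost for a given number of queries. It assumes the charge is prorated per query beyond the free 100 (actual billing granularity may differ), and estimate_daily_cost_usd is a name made up for this example.

```python
FREE_QUERIES_PER_DAY = 100
PRICE_PER_1000_USD = 5.0
MAX_QUERIES_PER_DAY = 10_000


def estimate_daily_cost_usd(queries: int) -> float:
    """Estimate one day's Custom Search JSON API cost from the figures above."""
    if queries < 0:
        raise ValueError("queries must be non-negative")
    if queries > MAX_QUERIES_PER_DAY:
        raise ValueError("the API allows at most 10,000 queries per day")
    # Only queries beyond the free daily allowance are billable.
    billable = max(0, queries - FREE_QUERIES_PER_DAY)
    return billable * PRICE_PER_1000_USD / 1000
```

Under this assumption, a bot that stays within 100 searches a day costs nothing, and one that makes 1,100 searches would pay about $5 for that day.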

Here is sample code showing how to call the Google Custom Search JSON API:

import logging

import requests

logger = logging.getLogger(__name__)


def search_with_google_custom_search(keywords, search_api_key, cx, num_results=10):
    """
    Search with the Google Custom Search API using a list of keywords.

    :param keywords: list of keywords
    :param search_api_key: API key for the Google Custom Search API
    :param cx: search engine ID
    :param num_results: number of results to return (default 10, the API's per-request maximum)
    :return: list of results, each with a title, link, and snippet
    """
    query = " ".join(keywords)  # Combine the keywords into one search query
    url = "https://www.googleapis.com/customsearch/v1"
    # Pass values via params so that requests URL-encodes spaces, "&", and non-ASCII keywords
    params = {"key": search_api_key, "cx": cx, "q": query, "num": num_results}

    try:
        logger.info(f"Searching for: {query}")
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()  # Raise an exception if the request failed
        result_data = response.json()

        # Check if there are search results
        if "items" not in result_data:
            logger.warning(f"No search results for query: {query}")
            return []

        results = result_data.get("items", [])
        formatted_results = []
        for item in results:
            formatted_results.append(
                {
                    "title": item.get("title", "No title"),
                    "link": item.get("link", ""),
                    "snippet": item.get("snippet", "No description available"),
                }
            )
        return formatted_results
    except requests.exceptions.RequestException as e:
        logger.error(f"Google Custom Search API request failed: {e}")
        return []

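You will need two values:

search_api_key: obtained from the documentation page mentioned above.
cx: the search engine ID, which you get from the Custom Search dashboard for your own search engine.

Extracting keywords

Another part that needs care: if your question is a full sentence, it is better to extract the "keywords" before passing it to Google Search. We use Gemini for this.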

import google.generativeai as genai


def extract_keywords_with_gemini(text, gemini_api_key, num_keywords=5):
    """
    Extract keywords from text with the Gemini API.

    :param text: the user's input text
    :param gemini_api_key: API key for the Gemini API
    :param num_keywords: number of keywords to extract (default 5)
    :return: list of extracted keywords
    """
    try:
        # Set the API key; configure() only stores the key, so calling it
        # on every request is fine
        genai.configure(api_key=gemini_api_key)

        # Create the Gemini model
        model = genai.GenerativeModel("gemini-1.5-flash")

        # Build the prompt asking the model to extract keywords
        prompt = f"""從以下文字中提取 {num_keywords} 個最重要的關鍵字或短語，只需返回關鍵字列表，不要有額外文字：

{text}

關鍵字："""

        # Generate the response
        response = model.generate_content(prompt)

        # Split the response text into a keyword list
        if response.text:
            keywords_text = response.text.strip()
            # One keyword per line
            keywords = [kw.strip() for kw in keywords_text.split("\n")]
            # Remove numeric prefixes, dashes, quotes, and other punctuation
            keywords = [kw.strip("0123456789. -\"'") for kw in keywords]
            # Drop empty entries
            keywords = [kw for kw in keywords if kw]
            return keywords[:num_keywords]  # Return at most num_keywords keywords
        return []
    except Exception as e:
        logger.error(f"Gemini API keyword extraction failed: {e}")
        # If the text is short, it might be a good search query already
        if len(text) < 100:
            return [text]
        return []

The code above extracts up to five keywords from your question to use for the search. It uses gemini-1.5-flash.
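The cleanup step is the part most likely to meet unexpected model output: depending on the reply, keywords may come back as a numbered list, as bullets, or on a single comma-separated line. Note also that str.strip with a digit set removes digits from the end of a keyword too, so "Python 3" would become "Python". The sketch below isolates the parsing so that it can be tested without calling Gemini. parse_keywords is a hypothetical helper, and splitting on commas (including the full-width ， and 、) is an addition not present in the original code.

```python
import re


def parse_keywords(raw: str, limit: int = 5) -> list:
    """Turn a model reply into a clean, de-duplicated keyword list."""
    # Split on newlines and on ASCII or CJK commas.
    pieces = re.split(r"[\n,，、]+", raw)
    keywords = []
    for piece in pieces:
        # Drop leading list markers such as "1. ", "2) ", "-", "*", then quotes.
        cleaned = re.sub(r"^\s*(?:\d+[.)]\s+|[-*•]\s*)?", "", piece).strip(" \"'")
        if cleaned and cleaned not in keywords:
            keywords.append(cleaned)
    return keywords[:limit]
```

Only a marker at the start of a piece is removed, so digits that belong to the keyword itself are kept.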

Summarizing the information

The summarization step is called through LangChain. Here is the relevant code for a quick look:

def summarize_text(text: str, max_tokens: int = 100) -> str:
    '''
    Summarize a text using the Google Generative AI model.
    '''
    llm = ChatGoogleGenerativeAI(
        model="gemini-1.5-flash",
        temperature=0,
        max_tokens=None,
        timeout=None,
        max_retries=2,
    )

    prompt_template = """用台灣用語的繁體中文，簡潔地以條列式總結文章重點。在摘要後直接加入相關的英文 hashtag，以空格分隔。內容來源可以是網頁、文章、論文、影片字幕或逐字稿。

    原文： "{text}"
    請遵循以下步驟來完成此任務：

    # 步驟
    1. 從提供的內容中提取重要重點，無論來源是網頁、文章、論文、影片字幕或逐字稿。
    2. 將重點整理成條列式，確保每一點為簡短且明確的句子。
    3. 使用符合台灣用語的簡潔繁體中文。
    4. 在摘要結尾處，加入至少三個相關的英文 hashtag，並以空格分隔。

    # 輸出格式
    - 重點應以條列式列出，每一點應為一個短句或片語，語言必須簡潔明瞭。
    - 最後加入至少三個相關的英文 hashtag，每個 hashtag 之間用空格分隔。

    # 範例
    輸入：
    文章內容：
    台灣的報告指出，環境保護的重要性日益增加。許多人開始選擇使用可重複使用的產品。政府也實施了多項政策來降低廢物。

    摘要：

    輸出：
    - 環境保護重要性增加
    - 越來越多人使用可重複產品
    - 政府實施減廢政策
    #EnvironmentalProtection #Sustainability #Taiwan
    """

    prompt = PromptTemplate.from_template(prompt_template)

    summarize_chain = load_summarize_chain(
        llm=llm, chain_type="stuff", prompt=prompt)
    document = Document(page_content=text)
    summary = summarize_chain.invoke([document])
    return summary["output_text"]
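Putting the three pieces together, the overall flow is: extract keywords, search, then summarize the result snippets. The sketch below wires them up with the functions passed in as parameters, so it can be exercised with fakes. answer_with_search and the fake_* functions are hypothetical names; in the bot, the real extract_keywords_with_gemini, search_with_google_custom_search, and summarize_text (with their API keys bound, for example via functools.partial) would be passed instead.

```python
from typing import Callable, Dict, List


def answer_with_search(
    question: str,
    extract: Callable[[str], List[str]],
    search: Callable[[List[str]], List[Dict[str, str]]],
    summarize: Callable[[str], str],
) -> str:
    """Keyword extraction -> web search -> summary, with simple fallbacks."""
    keywords = extract(question) or [question]  # fall back to the raw question
    results = search(keywords)
    if not results:
        return "No search results found."
    # Feed titles and snippets to the summarizer, one result per paragraph.
    corpus = "\n\n".join(f"{r['title']}\n{r['snippet']}" for r in results)
    return summarize(corpus)


# Fakes standing in for Gemini and Custom Search.
def fake_extract(question):
    return ["Google", "ADK"]


def fake_search(keywords):
    return [{"title": "ADK docs", "link": "https://example.invalid",
             "snippet": "Agent Development Kit."}]


def fake_summarize(text):
    return f"summary of {len(text.splitlines())} lines"
```

Keeping the three steps behind plain callables also makes it easy to swap the search backend later without touching the summarization code.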

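Future work

I originally built this for my "information assistant" (資訊小幫手), hoping to give it more basic capabilities by integrating Google Search and having it try to summarize answers to questions. Now that I have found the Custom Search API, there may be more interesting applications ahead.

You are welcome to deploy your own "information assistant". If you come up with a more unusual application, let me know.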

March 4th, 2025

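[Python] Switching from Gemini to Vertex AI in LangChain

Background

Earlier posts gave quite a few examples of building a LINE Bot with LangChain. If you want a more stable backend and more AI-related features, Vertex AI is a good choice. This post walks through the migration step by step, pointing out what needs attention and the problems you may run into.

Sample code:

https://github.com/kkdai/linebot-gemini-python

(This code can be deployed quickly to GCP Cloud Run.)

From a LangChain + Gemini LINE Bot to Vertex AI

First, here is a simple LangChain + Gemini LINE Bot sample:

Webhook handling code: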

    for event in events:
        if not isinstance(event, MessageEvent):
            continue

        if (event.message.type == "text"):
            # Process text message using LangChain
            msg = event.message.text
            response = generate_text_with_langchain(f'{msg}, reply in zh-TW:')
            reply_msg = TextSendMessage(text=response)
            await line_bot_api.reply_message(
                event.reply_token,
                reply_msg
            )

Next, the contents of generate_text_with_langchain:

# Initialize LangChain with Gemini
os.environ["GOOGLE_API_KEY"] = gemini_key

....

def generate_text_with_langchain(prompt):
    """
    Generate a text completion using LangChain with Gemini model.
    """
    # Create a chat prompt template with system instructions
    prompt_template = ChatPromptTemplate.from_messages([
        SystemMessage(
            content="You are a helpful assistant that responds in Traditional Chinese (zh-TW)."),
        HumanMessage(content=prompt)
    ])

    # Format the prompt and call the model
    formatted_prompt = prompt_template.format_messages()
    response = text_model.invoke(formatted_prompt)

    return response.content

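That is a partial excerpt of the code for building a LINE Bot with LangChain and Gemini; see the sample repository for the complete code.

What is Vertex AI, and what does it offer?

Vertex AI is Google Cloud's comprehensive machine learning platform, designed to simplify training, deploying, and managing models. It is especially suited to developers who need an enterprise-grade solution. Compared with using the Gemini API directly, Vertex AI offers the following:

Gemini models and more: Vertex AI supports the Gemini family (such as gemini-pro and gemini-1.5-flash) as well as other models (such as PaLM and Codey), giving you more options for different needs.
Managed runtime and autoscaling: with Vertex AI's Reasoning Engine you can deploy LangChain applications easily and get autoscaling, security, and monitoring without managing servers yourself.
Multimodal support: Gemini models on Vertex AI accept multimodal input (text, images, even video), which is very useful for a feature-rich LINE Bot (for example, handling image messages).
Enterprise security and compliance: Vertex AI provides IAM (Identity and Access Management), data encryption, and regional expansion, so that applications meet enterprise requirements.
Tool integration and function calling: Vertex AI supports Function Calling, letting the model interact with external APIs or tools such as weather lookup or database search, which makes the LINE Bot more useful.

Starting the migration from Gemini to Vertex AI

Two main parameters are needed first:

[Project_ID]: your GCP Project ID
[Location]: the Vertex AI region! The Vertex AI region! The Vertex AI region! (important enough to say three times)

There is a good Colab notebook that helps in understanding this.

The main code is as follows: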

import os

from IPython.display import Markdown, display
from google import genai
from google.genai.types import (
    FunctionDeclaration,
    GenerateContentConfig,
    GoogleSearch,
    HarmBlockThreshold,
    HarmCategory,
    MediaResolution,
    Part,
    Retrieval,
    SafetySetting,
    Tool,
    ToolCodeExecution,
    VertexAISearch,
)

PROJECT_ID = ""  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

# Set the model
MODEL_ID = "gemini-2.0-flash-001"  # @param {type: "string"}

# Generate output
response = client.models.generate_content(
    model=MODEL_ID, contents="What's the largest planet in our solar system?"
)

display(Markdown(response.text))

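So how do you migrate from LangChain + Gemini to LangChain + Vertex AI?

According to the LangChain documentation, Google Vertex AI is used mainly through the langchain_google_vertexai package (the code below uses its ChatVertexAI class). Points to note:

If the service runs on GCP Cloud Run, you do not need to supply a service account JSON key.
If it does not run on GCP, you must set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the service account JSON key file.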
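A small startup check can make the two cases explicit. The sketch assumes that Cloud Run is detected through the K_SERVICE environment variable, which Cloud Run sets for services; check_vertex_credentials is a hypothetical helper, not part of LangChain or the Google SDKs.

```python
import os


def check_vertex_credentials(env=os.environ) -> str:
    """Return a short description of where credentials will come from."""
    if env.get("K_SERVICE"):
        # On Cloud Run the attached service account is used automatically.
        return "cloud-run-service-account"
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not key_path:
        raise RuntimeError(
            "Set GOOGLE_APPLICATION_CREDENTIALS to the path of a service account JSON key.")
    if not os.path.isfile(key_path):
        raise RuntimeError(f"Credential file not found: {key_path}")
    return "service-account-key-file"
```

Application Default Credentials can also come from gcloud auth application-default login during local development, in which case this check is stricter than necessary.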

When running on GCP, this is the most basic way to initialize the model.

# Create LangChain Vertex AI model instances
# For Vertex AI, we use "gemini-2.0-flash" instead of "gemini-2.0-flash-lite"
text_model = ChatVertexAI(
    model_name="gemini-2.0-flash-001",
    project=google_project_id,
    location=google_location,
    max_output_tokens=1024
)

The code that calls Vertex AI through LangChain is then:

def generate_text_with_langchain(prompt):
    """
    Generate a text completion using LangChain with Vertex AI model.
    """
    # Create a chat prompt template with system instructions
    prompt_template = ChatPromptTemplate.from_messages([
        SystemMessage(
            content="You are a helpful assistant that responds in Traditional Chinese (zh-TW)."),
        HumanMessage(content=prompt)
    ])

    # Format the prompt and call the model
    formatted_prompt = prompt_template.format_messages()
    response = text_model.invoke(formatted_prompt)

    return response.content

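That is all it takes. This post focuses only on moving the text side of LangChain + Gemini over to Vertex AI. Image handling on Vertex AI goes through Google Cloud Storage, so it is left out for now.

Problems to watch out for

How do you resolve the following error?

details = "Publisher Model `projects/PROJECT_ID/locations/asia-east1/publishers/google/models/gemini-2.0-flash` not found."

This happens mainly because asia-east1 was chosen as the Vertex AI region, and that region does not currently support the gemini-2.0 models.

Update 2025-06-25: Google has since recommended using global as the location, which works better and is less error-prone.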
