A Coding Implementation of an Advanced Tool-Using AI Agent with Semantic Kernel and Gemini

Creating a Sophisticated AI Agent with Semantic Kernel and Google Gemini on Colab

This guide demonstrates how to develop a robust AI agent by integrating Semantic Kernel with Google's free Gemini model, running effortlessly on Google Colab. We begin by configuring Semantic Kernel plugins as functional tools (web search, mathematical computations, file operations, and note management) and then orchestrate these tools through Gemini's structured JSON responses. The agent intelligently plans, invokes tools, interprets feedback, and ultimately delivers a comprehensive answer.

Setting Up the Environment and Dependencies

!pip -q install semantic-kernel google-generativeai duckduckgo-search rich

import os
import re
import json
import time
import math
import pathlib
import getpass
import textwrap
import typing as T
from rich import print
import google.generativeai as genai
from duckduckgo_search import DDGS

Configure Gemini API key and model

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY") or getpass.getpass("🔐 Enter GEMINI_API_KEY: ")
genai.configure(api_key=GEMINI_API_KEY)

GEMINI_MODEL = "gemini-1.5-flash"
model = genai.GenerativeModel(GEMINI_MODEL)

import semantic_kernel as sk

try:
    from semantic_kernel.functions import kernel_function
except ImportError:
    from semantic_kernel.utils.function_decorator import kernel_function

We start by installing the necessary packages and importing key modules, including Semantic Kernel, Google Gemini, and DuckDuckGo search. The Gemini API key is read securely, and the model is initialized to generate AI responses. Semantic Kernel's kernel_function decorator is imported so we can register custom tools.
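The getenv-or-prompt pattern used for the API key can be tried on its own. The helper name load_key and the demo-key value below are illustrative, not part of the tutorial's code:

```python
import os
import getpass

def load_key(var: str = "GEMINI_API_KEY") -> str:
    # Read the key from the environment; fall back to an interactive
    # prompt only when the variable is unset.
    return os.getenv(var) or getpass.getpass(f"Enter {var}: ")

os.environ["GEMINI_API_KEY"] = "demo-key"  # stand-in value for this demo
print(load_key())  # → demo-key
```

On Colab, setting the variable once via the Secrets panel (or `os.environ`) avoids re-entering the key on every run.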

Defining the Agent’s Toolkit with Semantic Kernel

class AgentTools:
    """A collection of Semantic Kernel tools accessible by the AI agent."""

    def __init__(self):
        self.notes: list[str] = []

    @kernel_function(name="websearch", description="Perform a web search and return JSON with {title, href, body}.")
    def websearch(self, query: str, k: int = 5) -> str:
        k = max(1, min(int(k), 10))
        results = list(DDGS().text(query, max_results=k))
        return json.dumps(results[:k], ensure_ascii=False)

    @kernel_function(name="calc", description="Safely evaluate mathematical expressions like '41*73+5' or 'sin(pi/4)**2'.")
    def calc(self, expression: str) -> str:
        allowed = {"__builtins__": {}}
        for const in ("pi", "e", "tau"):
            allowed[const] = getattr(math, const)
        for func in ("sin", "cos", "tan", "asin", "sqrt", "log", "log10", "exp", "floor", "ceil"):
            allowed[func] = getattr(math, func)
        return str(eval(expression, allowed, {}))

    @kernel_function(name="now", description="Return the current local date and time as a string.")
    def now(self) -> str:
        return time.strftime("%Y-%m-%d %H:%M:%S")

    @kernel_function(name="writefile", description="Save text content to a specified file path; returns the saved path.")
    def writefile(self, path: str, content: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content, encoding="utf-8")
        return str(p)

    @kernel_function(name="readfile", description="Read and return up to 4000 characters from a file at the given path.")
    def readfile(self, path: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        return p.read_text(encoding="utf-8")[:4000]

    @kernel_function(name="addnote", description="Add a brief note to the agent's memory.")
    def addnote(self, note: str) -> str:
        self.notes.append(note.strip())
        return f"Notes stored: {len(self.notes)}"

    @kernel_function(name="searchnotes", description="Search stored notes for a keyword and return top matches.")
    def searchnotes(self, query: str) -> str:
        q = query.lower()
        matches = [note for note in self.notes if q in note.lower()]
        return json.dumps(matches[:10], ensure_ascii=False)


Initialize Semantic Kernel and register tools

kernel = sk.Kernel()
tools = AgentTools()
kernel.add_plugin(tools, "agenttools")

Here, we create the AgentTools class, equipping the AI with capabilities such as web searching, secure math evaluation, current time retrieval, file reading/writing, and note-taking. These tools are then registered with Semantic Kernel, enabling the agent to invoke them dynamically during its reasoning process.
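The restricted-eval pattern behind the calc tool can be exercised standalone. The helper safe_calc below mirrors the same whitelist idea (its name and the shortened function list are ours, for illustration):

```python
import math

def safe_calc(expression: str) -> str:
    # Whitelist of names the expression may reference; __builtins__ is
    # replaced with an empty dict so eval cannot reach open(),
    # __import__(), or any other builtin.
    allowed = {"__builtins__": {}}
    for name in ("pi", "e", "sin", "cos", "sqrt", "log"):
        allowed[name] = getattr(math, name)
    return str(eval(expression, allowed, {}))

print(safe_calc("41*73+5"))       # → 2998
print(safe_calc("sin(pi/4)**2"))  # ≈ 0.5

try:
    safe_calc("__import__('os').system('ls')")
except NameError as e:
    # __import__ is not in the whitelist, so the lookup fails.
    print("blocked:", e)
```

Emptying `__builtins__` blocks name lookups rather than sandboxing the interpreter; for untrusted input a dedicated parser (e.g. `ast.literal_eval` or a math-expression library) would be safer still.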

Cataloging Tools and Crafting the Agent’s Instruction Set

def listtools() -> dict[str, dict]:
    registry = {}
    for name in ("websearch", "calc", "now", "writefile", "readfile", "addnote", "searchnotes"):
        fn = getattr(tools, name)
        desc = getattr(fn, "description", "") or fn.__doc__ or ""
        sig = "()" if name == "now" else "(**kwargs)"
        registry[name] = {"callable": fn, "description": desc.strip(), "signature": sig}
    return registry

TOOLS = listtools()

CATALOG = "\n".join(
    [f"- {name}{info['signature']}: {info['description']}" for name, info in TOOLS.items()]
)

SYSTEMPROMPT = f"""You are a precise AI agent that uses tools.
Invoke TOOLS by returning ONLY a JSON object:
{{"tool": "<name>", "args": {{...}}}}
When finished, respond with:
{{"finalanswer": "<your answer>"}}

Available TOOLS:
{CATALOG}

Guidelines:
  • Prioritize accuracy; cite websearch results by title.
  • Limit to a maximum of 8 tool calls.
  • For file outputs, use writefile and specify the saved file path.
  • If a tool fails, modify arguments and retry.
"""

We compile a registry of all available tools, including their descriptions and expected signatures, then embed this catalog into a system prompt. The prompt instructs Gemini to interact with tools strictly via JSON commands and to finish with a final answer that cites its sources.
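The catalog-plus-JSON contract can be sketched in isolation. The registry entries below are illustrative stand-ins for TOOLS, and validate is a hypothetical helper showing what a well-formed tool call must satisfy:

```python
import json

# Stand-in registry with the same shape as TOOLS.
registry = {
    "calc": {"signature": "(**kwargs)", "description": "Evaluate math."},
    "now":  {"signature": "()",         "description": "Current time."},
}

# Render the catalog exactly as the system prompt does.
catalog = "\n".join(
    f"- {name}{info['signature']}: {info['description']}"
    for name, info in registry.items()
)
print(catalog)

def validate(command: dict) -> bool:
    # A well-formed command names a registered tool and passes a dict of args.
    return command.get("tool") in registry and isinstance(command.get("args", {}), dict)

print(validate(json.loads('{"tool": "calc", "args": {"expression": "2+2"}}')))  # → True
print(validate({"tool": "unknown"}))                                            # → False
```

Validating the command before dispatch is what lets the loop report an unknown tool back to the model instead of crashing.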

Parsing Model Outputs and Managing the Agent’s Workflow

def extractjson(text: str) -> dict | None:
    for match in re.finditer(r"\{.*\}", text, flags=re.S):
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            continue
    return None

def runagent(task: str, maxsteps: int = 8, verbose: bool = True) -> str:
    transcript: list[dict] = [
        {"role": "system", "parts": [SYSTEMPROMPT]},
        {"role": "user", "parts": [task]}
    ]
    observations = ""

    for step in range(1, maxsteps + 1):
        content = []
        for message in transcript:
            role = message["role"]
            for part in message["parts"]:
                content.append({"text": f"[{role.upper()}]\n{part}\n"})
        if observations:
            content.append({"text": f"[OBSERVATIONS]\n{observations[-4000:]}\n"})

        response = model.generate_content(content, request_options={"timeout": 60})
        text = response.text or ""

        if verbose:
            print(f"\n[Step {step} - Model Response]\n{textwrap.shorten(text, 1000)}")

        command = extractjson(text)
        if not command:
            transcript.append({"role": "user", "parts": ["Please respond with exactly one JSON object as per instructions."]})
            continue

        if "finalanswer" in command:
            return command["finalanswer"]

        if "tool" in command:
            toolname = command["tool"]
            args = command.get("args", {})
            if toolname not in TOOLS:
                observations += f"\nToolError: Unknown tool '{toolname}'."
                continue
            try:
                output = TOOLS[toolname]["callable"](**args)
                outputstr = output if isinstance(output, str) else json.dumps(output, ensure_ascii=False)
                if len(outputstr) > 4000:
                    outputstr = outputstr[:4000] + "...[truncated]"
                observations += f"\n[{toolname}] {outputstr}"
                transcript.append({"role": "user", "parts": [f"Observation from {toolname}:\n{outputstr}"]})
            except Exception as e:
                observations += f"\nToolError {toolname}: {e}"
                transcript.append({"role": "user", "parts": [f"ToolError {toolname}: {e}"]})
        else:
            transcript.append({"role": "user", "parts": ["Output must be a single JSON object with a tool call or finalanswer."]})

    return "Step limit reached. Summary of observations:\n" + observations[-1500:]

This function manages the agent’s iterative reasoning cycle. It sends the system and user context to Gemini, enforces JSON-only responses for tool invocations, executes the requested tools, and feeds the results back into the conversation. If the model deviates from the expected format, it is prompted to correct itself. Upon reaching the maximum number of steps, the agent summarizes its findings.
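The act-observe cycle can be dry-run without an API key by scripting the model's replies. Everything here (run_stub, SCRIPT, the lambda tool) is a hypothetical sketch of the loop's control flow, not the tutorial's runagent:

```python
import json

# Scripted "model" replies: one tool call, then a final answer.
SCRIPT = [
    '{"tool": "calc", "args": {"expression": "41*73+5"}}',
    '{"finalanswer": "41*73+5 = 2998"}',
]

def run_stub(tools: dict):
    observations = []
    for reply in SCRIPT:
        cmd = json.loads(reply)
        if "finalanswer" in cmd:          # terminal response ends the loop
            return cmd["finalanswer"], observations
        fn = tools[cmd["tool"]]           # dispatch to the named tool
        observations.append(fn(**cmd["args"]))
    return None, observations

# A one-tool registry: calc evaluates math with builtins disabled.
answer, obs = run_stub(
    {"calc": lambda expression: str(eval(expression, {"__builtins__": {}}, {}))}
)
print(answer)  # → 41*73+5 = 2998
print(obs)     # → ['2998']
```

Stubbing the model this way is also a cheap unit test for the dispatch and observation plumbing before spending real API calls.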

Demonstration: Multi-Task Agent in Action

DEMOTASK = (
    "Retrieve the top 3 concise facts about Chandrayaan-3 with sources, "
    "calculate 41*73+5, save a 3-line summary to '/content/notes.txt', "
    "add this summary to notes, then display the current time and provide a final answer."
)

if __name__ == "__main__":
    print("[bold]🔧 Loaded Tools:[/bold]", ", ".join(TOOLS.keys()))
    finalresponse = runagent(DEMOTASK, maxsteps=8, verbose=True)
    print("\n" + "="*80 + "\n[bold green]FINAL ANSWER[/bold green]\n" + finalresponse + "\n")

In this example, the agent performs a series of tasks: it searches the web for verified facts, performs a mathematical calculation, writes a summary to a file, stores notes internally, and reports the current time. The entire workflow is executed end-to-end, showcasing the seamless integration of tool usage and reasoning.

Summary and Future Directions

This tutorial highlights how Semantic Kernel and Google Gemini can be combined to create a compact yet powerful AI agent within the Google Colab environment. The agent not only calls external tools but also integrates their outputs into its reasoning loop, producing well-structured and cited final answers. This modular framework serves as a solid foundation for expanding with additional tools or more complex tasks, demonstrating that building advanced AI agents can be straightforward and efficient when leveraging the right technologies.


