Building a Fully Local Machine Learning Pipeline with Ollama and Google Colab
In this guide, we explore how to build a completely local, API-independent machine learning workflow by integrating Ollama with Google Colab. We begin by configuring a reproducible environment, then create a synthetic dataset for demonstration purposes. Next, we instruct an AI agent to generate a training script, refining its output to handle common errors, repair imports, and fall back to a known-good script when necessary. The result is an automated pipeline that runs end to end without external API calls and without sacrificing control over the code that gets executed.
Setting Up the Environment and Dependencies
First, we define a utility function to execute shell commands within the Colab environment. This function prints each command before running it, captures its output, and raises an exception if the command fails, allowing us to monitor progress in real time.
import os
import subprocess

def run_shell_command(command, check=True, env=None, cwd=None):
    print(f"$ {command}")
    process = subprocess.run(
        command,
        shell=True,
        # Merge extra variables over the inherited environment
        env={**os.environ, **(env or {})} if env else None,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    print(process.stdout)
    if check and process.returncode != 0:
        raise RuntimeError(process.stdout)
    return process.stdout
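The key detail in a wrapper like this is the environment merge: passing only the extra variables to `subprocess.run` would hide `PATH` and everything else from the child process. A standalone sketch of that merge pattern (the `DEMO_FLAG` variable is invented for illustration):

```python
import os
import subprocess

# Run a command with one extra variable layered over the inherited
# environment, so PATH and friends remain visible to the child shell.
extra_env = {"DEMO_FLAG": "1"}  # hypothetical variable for illustration
result = subprocess.run(
    "echo flag=$DEMO_FLAG",
    shell=True,
    env={**os.environ, **extra_env},  # extras win on key collisions
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)
print(result.stdout.strip())  # → flag=1
```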
Next, we establish directory paths and filenames for our project workspace within Colab. We install the necessary Python packages, including mle-agent, scikit-learn, pandas, numpy, and joblib. We also install Ollama locally, start its server, and pull the specified language model to enable offline code generation.
from pathlib import Path
import time

WORK_DIR = Path("/content/mle_colab_demo")
WORK_DIR.mkdir(parents=True, exist_ok=True)
PROJECT_DIR = WORK_DIR / "proj"
PROJECT_DIR.mkdir(exist_ok=True)
DATA_PATH = WORK_DIR / "data.csv"
MODEL_PATH = WORK_DIR / "model.joblib"
PREDICTIONS_PATH = WORK_DIR / "preds.csv"
SAFE_SCRIPT_PATH = WORK_DIR / "train_safe.py"
RAW_SCRIPT_PATH = WORK_DIR / "agent_train_raw.py"
FINAL_SCRIPT_PATH = WORK_DIR / "train.py"
MODEL_NAME = os.environ.get("OLLAMA_MODEL", "llama3.2:1b")

run_shell_command("pip -q install --upgrade pip")
run_shell_command("pip -q install mle-agent scikit-learn pandas numpy joblib")
run_shell_command("curl -fsSL https://ollama.com/install.sh | sh")
ollama_server = subprocess.Popen("ollama serve", shell=True)
time.sleep(4)
run_shell_command(f"ollama pull {MODEL_NAME}")
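A fixed `time.sleep` is fragile: on a slow Colab instance the server may not be up yet when the pull starts. A small polling helper (a sketch, not part of the original pipeline) waits until the server actually answers; Ollama's `/api/tags` endpoint, which lists installed models, makes a convenient readiness probe:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, timeout: float = 30.0) -> bool:
    """Poll `url` until it responds or `timeout` seconds elapse."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(0.5)
    return False

# After `ollama serve` starts, e.g.:
# wait_for_server("http://127.0.0.1:11434/api/tags")
```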
Generating a Synthetic Dataset and Preparing the Prompt
We create a synthetic dataset with 500 samples and four features, simulating a binary classification problem. The target variable is generated using a linear combination of features with added noise, then saved as a CSV file.
import numpy as np
import pandas as pd

np.random.seed(0)
n_samples = 500
X = np.random.rand(n_samples, 4)
coefficients = np.array([0.4, -0.2, 0.1, 0.5])
noise = 0.15 * np.random.randn(n_samples)
y = ((X @ coefficients + noise) > 0.55).astype(int)
df = pd.DataFrame(np.c_[X, y], columns=["feature1", "feature2", "feature3", "feature4", "target"])
df.to_csv(DATA_PATH, index=False)
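Before handing the file to the agent, it is worth confirming the dataset looks as intended. This standalone sketch regenerates the same data from the same seed and prints a few quick checks; the positive-class rate sits well away from 0 and 1, so stratified splitting and balanced class weights have something to work with:

```python
import numpy as np
import pandas as pd

# Regenerate the synthetic dataset with the same seed and inspect it.
np.random.seed(0)
X = np.random.rand(500, 4)
coefficients = np.array([0.4, -0.2, 0.1, 0.5])
noise = 0.15 * np.random.randn(500)
y = ((X @ coefficients + noise) > 0.55).astype(int)
df = pd.DataFrame(np.c_[X, y],
                  columns=["feature1", "feature2", "feature3", "feature4", "target"])

print(df.shape)                       # → (500, 5)
print(sorted(df["target"].unique()))  # labels are binary
print(round(df["target"].mean(), 2))  # fraction of positive labels
```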
We then configure environment variables to ensure the MLE-Agent uses Ollama locally without requiring external API keys. A precise prompt instructs the agent to generate a train.py script that reads the dataset, performs an 80/20 stratified train-test split, builds a pipeline with imputation, scaling, and logistic regression, evaluates ROC-AUC and F1 scores, prints sorted coefficient magnitudes, and saves the model and predictions.
env_vars = {
    "OPENAI_API_KEY": "",
    "ANTHROPIC_API_KEY": "",
    "GEMINI_API_KEY": "",
    "OLLAMA_HOST": "http://127.0.0.1:11434",
    "MLE_LLM_ENGINE": "ollama",
    "MLE_MODEL": MODEL_NAME
}
prompt_text = f"""
Return ONE fenced python code block only.
Write train.py that reads {DATA_PATH}; 80/20 split (random_state=42, stratify);
Pipeline: SimpleImputer + StandardScaler + LogisticRegression(class_weight='balanced', max_iter=1000, random_state=42);
Print ROC-AUC & F1; print sorted coefficient magnitudes; save model to {MODEL_PATH} and preds to {PREDICTIONS_PATH};
Use only sklearn, pandas, numpy, joblib; no extra text.
"""
import re

def extract_python_code(text: str) -> str | None:
    # Remove ANSI escape sequences from terminal output
    cleaned = re.sub(r"\x1B\[[0-?]*[ -/]*[@-~]", "", text)
    # Extract a fenced ```python ... ``` code block
    match = re.search(r"```(?:python)?\s*([\s\S]*?)```", cleaned, re.I)
    if match:
        return match.group(1).strip()
    # Fallback: if the text starts with 'python', return the rest
    if cleaned.strip().lower().startswith("python"):
        return cleaned.strip()[6:].strip()
    # Fallback: start from the first import statement, if present
    match = re.search(r"(?:^|\n)(from\s+[^\n]+|import\s+[^\n]+)([\s\S]*)", cleaned)
    if match:
        return (match.group(1) + match.group(2)).strip()
    return None
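To see the extraction logic in isolation, here is a minimal, self-contained version of the fenced-block regex applied to a mock model reply (the sample text is invented for illustration):

```python
import re

# A typical LLM reply: prose wrapped around a fenced python block.
sample = "Here is your script:\n```python\nprint('hello')\n```\nDone."

# Lazily capture everything between the opening and closing fences.
match = re.search(r"```(?:python)?\s*([\s\S]*?)```", sample, re.I)
code = match.group(1).strip() if match else None
print(code)  # → print('hello')
```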
output = run_shell_command(f'printf %s "{prompt_text}" | mle chat', check=False, cwd=str(PROJECT_DIR), env=env_vars)
generated_code = extract_python_code(output)
if not generated_code:
    # Fall back to querying Ollama directly if mle chat produced nothing usable
    raw = run_shell_command(f'printf %s "{prompt_text}" | ollama run {MODEL_NAME}', check=False, env=env_vars)
    generated_code = extract_python_code(raw) or raw or ""
RAW_SCRIPT_PATH.write_text(generated_code, encoding="utf-8")
Refining the Generated Script for Reliability
To ensure the generated training script runs smoothly, we implement a sanitization function that removes unwanted prefixes and corrects common import errors related to scikit-learn modules. It also adds any missing essential imports to guarantee the script’s completeness.
def sanitize_script(source_code: str) -> str:
    if not source_code:
        return ""
    code = source_code.replace("\r", "")
    code = re.sub(r"^python\b", "", code.strip(), flags=re.I).strip()
    corrections = {
        r"from\s+sklearn\.pipeline\s+import\s+SimpleImputer": "from sklearn.impute import SimpleImputer",
        r"from\s+sklearn\.preprocessing\s+import\s+SimpleImputer": "from sklearn.impute import SimpleImputer",
        r"from\s+sklearn\.pipeline\s+import\s+StandardScaler": "from sklearn.preprocessing import StandardScaler",
        r"from\s+sklearn\.preprocessing\s+import\s+ColumnTransformer": "from sklearn.compose import ColumnTransformer",
        r"from\s+sklearn\.pipeline\s+import\s+ColumnTransformer": "from sklearn.compose import ColumnTransformer",
    }
    for pattern, replacement in corrections.items():
        code = re.sub(pattern, replacement, code)
    if "SimpleImputer" in code and "from sklearn.impute import SimpleImputer" not in code:
        code = "from sklearn.impute import SimpleImputer\n" + code
    if "StandardScaler" in code and "from sklearn.preprocessing import StandardScaler" not in code:
        code = "from sklearn.preprocessing import StandardScaler\n" + code
    if "ColumnTransformer" in code and "from sklearn.compose import ColumnTransformer" not in code:
        code = "from sklearn.compose import ColumnTransformer\n" + code
    if "train_test_split" in code and "from sklearn.model_selection import train_test_split" not in code:
        code = "from sklearn.model_selection import train_test_split\n" + code
    if "joblib" in code and "import joblib" not in code:
        code = "import joblib\n" + code
    return code

sanitized_code = sanitize_script(generated_code)
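The repair idea can be demonstrated in isolation. The snippet below is an invented example of a typical model mistake (importing SimpleImputer from the wrong module, using StandardScaler without importing it); the sketch applies one regex correction, prepends the missing import, and uses compile() to confirm the result still parses:

```python
import re

# Invented example of flawed agent output.
snippet = (
    "from sklearn.pipeline import SimpleImputer\n"
    "imp = SimpleImputer()\n"
    "scaler = StandardScaler()\n"
)

# Rewrite the wrong import to point at sklearn.impute.
fixed = re.sub(r"from\s+sklearn\.pipeline\s+import\s+SimpleImputer",
               "from sklearn.impute import SimpleImputer", snippet)
# Prepend the import the snippet uses but never declares.
if "StandardScaler" in fixed and "from sklearn.preprocessing import StandardScaler" not in fixed:
    fixed = "from sklearn.preprocessing import StandardScaler\n" + fixed

compile(fixed, "<agent_script>", "exec")  # raises SyntaxError if still broken
print(fixed.splitlines()[0])  # → from sklearn.preprocessing import StandardScaler
```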
Additionally, we prepare a robust fallback training script that performs the entire pipeline deterministically. This script reads the dataset, applies preprocessing and logistic regression, evaluates performance metrics, prints the most influential features, and saves the model and predictions. This fallback ensures that even if the agent-generated code is flawed, the training process remains uninterrupted.
import textwrap

fallback_script = textwrap.dedent(f"""
import pandas as pd
import numpy as np
import joblib
from pathlib import Path
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, f1_score
from sklearn.compose import ColumnTransformer

DATA_PATH = Path("{DATA_PATH}")
MODEL_PATH = Path("{MODEL_PATH}")
PREDICTIONS_PATH = Path("{PREDICTIONS_PATH}")

df = pd.read_csv(DATA_PATH)
X = df.drop(columns=["target"])
y = df["target"].astype(int)
numeric_features = X.columns.tolist()

preprocessor = ColumnTransformer([
    ("num", Pipeline([
        ("imputer", SimpleImputer()),
        ("scaler", StandardScaler())
    ]), numeric_features)
])
classifier = LogisticRegression(class_weight='balanced', max_iter=1000, random_state=42)
pipeline = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", classifier)
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
pipeline.fit(X_train, y_train)
probabilities = pipeline.predict_proba(X_test)[:, 1]
predictions = (probabilities >= 0.5).astype(int)

print("ROC-AUC:", round(roc_auc_score(y_test, probabilities), 4))
print("F1 Score:", round(f1_score(y_test, predictions), 4))

coefficients = pd.Series(
    pipeline.named_steps["classifier"].coef_.ravel(),
    index=numeric_features
).abs().sort_values(ascending=False)
print("Top coefficients by absolute magnitude:\\n", coefficients.to_string())

joblib.dump(pipeline, MODEL_PATH)
pd.DataFrame({{
    "y_true": y_test.reset_index(drop=True),
    "y_prob": probabilities,
    "y_pred": predictions
}}).to_csv(PREDICTIONS_PATH, index=False)
print("Model and predictions saved to:", MODEL_PATH, PREDICTIONS_PATH)
""").strip()
Executing the Final Training Script and Verifying Outputs
We determine whether to use the sanitized agent-generated script or the fallback script based on the presence of key imports and functions. Both scripts are saved for transparency. The selected script is then executed, and we display a snippet of its content along with a list of all generated files to confirm successful completion.
final_code = sanitized_code if ("import " in sanitized_code and "sklearn" in sanitized_code and "read_csv" in sanitized_code) else fallback_script
SAFE_SCRIPT_PATH.write_text(fallback_script, encoding="utf-8")
FINAL_SCRIPT_PATH.write_text(final_code, encoding="utf-8")
print("\n=== Using train.py (first 800 characters) ===\n")
print(final_code[:800], "\n...")
run_shell_command(f"python {FINAL_SCRIPT_PATH}")
artifacts = [str(p) for p in WORK_DIR.glob("*")]
print("\nGenerated artifacts:", artifacts)
print("✅ Workflow completed successfully - outputs are located in", WORK_DIR)
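The saved joblib file is an ordinary scikit-learn estimator that can be reloaded in a fresh session for scoring. Here is a standalone sketch of that round trip, using a throwaway model and a temporary path rather than the files created above:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Train a throwaway pipeline on synthetic data and persist it, mimicking
# what train.py does with the real artifact.
rng = np.random.default_rng(42)
X = rng.random((100, 4))
y = (X.sum(axis=1) > 2.0).astype(int)
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)

model_path = os.path.join(tempfile.mkdtemp(), "demo_model.joblib")
joblib.dump(pipe, model_path)

# Reload, as a later session would, and score unseen rows.
reloaded = joblib.load(model_path)
new_rows = rng.random((3, 4))
print(reloaded.predict_proba(new_rows)[:, 1].round(3))
```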
Summary
This tutorial demonstrates how to seamlessly integrate local large language models with traditional machine learning pipelines, eliminating the need for external API keys and enhancing reproducibility. By combining Ollama’s local inference capabilities with automated script generation and robust sanitization, we create a dependable framework for training, evaluating, and saving models entirely offline. This method empowers practitioners to maintain full control over their workflows while leveraging the benefits of automation.
