
Sticker Generator with n8n and Gemini

I built a small n8n workflow to generate “die-cut” sticker-style PNGs for my new blog. The core approach, using chromakey green backgrounds and HSV-based removal, comes from Phil Schmid’s excellent post. I adapted his technique for n8n with a lighter dependency footprint suitable for VPS deployment.


1. What We’re Building

The workflow accepts a text prompt via webhook, generates a sticker image using Google Gemini, removes the background to create transparency, and stores both the original and processed versions in Supabase Storage. The final result is a signed URL pointing to a transparent PNG.

| Step | Component | Purpose |
|------|-----------|---------|
| 1 | Webhook | Receive prompt via POST request |
| 2 | Gemini | Generate sticker image with chromakey background |
| 3 | Supabase | Store original, provide URL for Python to fetch |
| 4 | Python | Remove green background, create transparency |
| 5 | Supabase | Store final transparent PNG |

The fun parts: the task runners setup (which allows running Python with external packages), Supabase as an intermediate storage layer (which solves n8n’s Python binary input limitation), and the chromakey approach for background removal. Let me walk through each component.

2. Setting Up n8n Task Runners

n8n recently introduced task runners as a way to execute code nodes in isolated containers. This is useful for two reasons: security isolation and the ability to install custom packages. The default n8n Code node runs in a sandboxed environment with limited libraries. Task runners let you break out of that limitation.

2.1 The Docker Compose Setup

The setup requires two services: the main n8n instance and a separate task runners container. Here’s the relevant part of the compose file:

yaml
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n:latest
    container_name: n8n
    restart: always
    environment:
      # ... database config ...
      # Task runners configuration
      - N8N_RUNNERS_ENABLED=true
      - N8N_RUNNERS_MODE=external
      - N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0
      - N8N_RUNNERS_AUTH_TOKEN=${N8N_RUNNERS_AUTH_TOKEN}
    ports:
      - "5678:5678"
    networks:
      - internal

  n8n-runners:
    build:
      context: ./runners
      dockerfile: Dockerfile
    image: n8n-runners:custom
    container_name: n8n-runners
    restart: always
    environment:
      - N8N_RUNNERS_TASK_BROKER_URI=http://n8n:5679
      - N8N_RUNNERS_AUTH_TOKEN=${N8N_RUNNERS_AUTH_TOKEN}
    depends_on:
      - n8n
    networks:
      - internal

The key settings are N8N_RUNNERS_ENABLED=true and N8N_RUNNERS_MODE=external. This tells n8n to delegate code execution to the external runners container instead of running it internally. The auth token is shared between n8n and the runners container for secure communication. Generate one with openssl rand -hex 32 and store it in your .env file.

2.2 Custom Runners Dockerfile

The default runners image comes with basic Python and JavaScript support, but we need additional packages for image processing. I created a simple Dockerfile for the custom runner:

dockerfile
FROM n8nio/runners:latest

USER root

# Install JavaScript packages
RUN cd /opt/runners/task-runner-javascript && pnpm add \
  zod \
  lodash

# Install Python packages
RUN cd /opt/runners/task-runner-python && uv pip install \
  numpy \
  pillow \
  pydantic \
  requests

# Copy custom task runners config
COPY n8n-task-runners.json /etc/n8n-task-runners.json

USER runner

For this workflow you only need numpy, requests and pillow (PIL).

2.3 Runner Configuration

The n8n-task-runners.json file controls what modules are allowed in code nodes. This is important because even with packages installed, n8n restricts what can be imported by default:

json
{
  "task-runners": [
    {
      "runner-type": "javascript",
      "workdir": "/home/runner",
      "command": "/usr/local/bin/node",
      "args": [
        "--disallow-code-generation-from-strings",
        "--disable-proto=delete",
        "/opt/runners/task-runner-javascript/dist/start.js"
      ],
      "env-overrides": {
        "NODE_FUNCTION_ALLOW_BUILTIN": "crypto",
        "NODE_FUNCTION_ALLOW_EXTERNAL": "moment,zod,lodash"
      }
    },
    {
      "runner-type": "python",
      "workdir": "/home/runner",
      "command": "/opt/runners/task-runner-python/.venv/bin/python",
      "args": ["-m", "src.main"],
      "env-overrides": {
        "PYTHONPATH": "/opt/runners/task-runner-python",
        "N8N_RUNNERS_STDLIB_ALLOW": "base64,io,json,re,datetime,typing,collections",
        "N8N_RUNNERS_EXTERNAL_ALLOW": "numpy,PIL,pydantic,requests"
      }
    }
  ]
}

The N8N_RUNNERS_EXTERNAL_ALLOW setting is what unlocks numpy and PIL for import. Without this, the code node would reject the imports even though the packages are installed.
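Once the runner image is rebuilt with this config, a quick sanity check you can paste into a Python code node is to import both packages and round-trip a tiny image (drop the print in a real workflow; the allow-list above must include numpy and PIL for this to run):

```python
import numpy as np
from PIL import Image

# A 4x4 pure chromakey-green RGBA image, the kind of data this workflow handles
img = Image.new("RGBA", (4, 4), (0, 255, 0, 255))
arr = np.array(img)
print(arr.shape)  # (4, 4, 4)
```

If the imports are rejected, the N8N_RUNNERS_EXTERNAL_ALLOW list is the first place to look.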

3. Supabase as Intermediate Storage

3.1 The Python Binary Problem

When I was setting this up, I ran into the following issue: Python code nodes cannot directly receive binary data as input. You can output binary data from Python (with the right return format), but you can’t pass an image directly into a Python node from a previous step. This is by design since n8n switched to native Python for security reasons.

This creates a problem for image processing workflows. The Gemini node outputs an image, but Python can’t consume it directly. You need an intermediate step.

My solution was to use Supabase Storage as a handoff layer. Upload the image, get a signed URL, and have Python fetch it via HTTP. The signed URL is valid for 24 hours, which gives plenty of time for the workflow to complete while also allowing manual inspection if needed.

3.2 Modularity Over Complexity

I have a general philosophy with n8n: nodes should be simple and modular, not complex monoliths. When I find myself building elaborate logic inside a single workflow, that’s usually a sign I should extract it into a sub-workflow.

Supabase Storage requires multiple steps: determine the bucket, generate a filename, upload via HTTP, handle errors, generate a signed URL. That’s a lot of logic to duplicate every time I need to store a file. So I wrapped it in a reusable sub-workflow that any other workflow can call.

The sub-workflow accepts binary data as input and returns:

  • The storage path
  • A signed URL valid for 24 hours
  • Success/error status

This pattern pays off quickly. The storage uploader now gets called from multiple workflows, not just this sticker generator.

Here’s the complete sub-workflow visualized:


Storage Uploader sub-workflow with error handling branches (visualized with n8nmermaid)

3.3 Automatic Bucket Selection

The first step determines which bucket to use based on the file’s MIME type:

javascript
const binary = $binary["data"];
const mimeType = binary?.mimeType?.toLowerCase() || '';
const fileName = binary?.fileName || '';
const ext = fileName.split('.').pop().toLowerCase();

// Use MIME type if available
if (mimeType && mimeType !== 'application/octet-stream') {
  if (mimeType.startsWith('image/')) return 'image-files';
  if (mimeType.startsWith('audio/')) return 'audio-files';
  if (mimeType.startsWith('video/')) return 'video-files';
}

// Fallback to extension
if (['jpg', 'jpeg', 'png', 'gif', 'bmp', 'webp'].includes(ext)) return 'image-files';
if (['mp3', 'wav', 'ogg', 'flac', 'aac', 'm4a'].includes(ext)) return 'audio-files';
if (['mp4', 'mov', 'avi', 'mkv', 'webm'].includes(ext)) return 'video-files';

return 'document-files';

This is set up in a Set node with an expression. The buckets need to exist in Supabase beforehand.

3.4 Filename Generation

I generate timestamped filenames to avoid collisions:

javascript
// `category` and `ext` are set by the earlier bucket-selection and input steps
const pad = n => n.toString().padStart(2,'0');
const now = new Date();
const date = `${now.getFullYear()}${pad(now.getMonth()+1)}${pad(now.getDate())}`;
const time = `${pad(now.getHours())}${pad(now.getMinutes())}${pad(now.getSeconds())}`;

return `${category}-${date}-${time}.${ext}`;
// Example: image-20260125-143052.png

3.5 Upload and Sign

The actual upload uses Supabase’s REST API:

plaintext
POST https://[your-project]/storage/v1/object/{bucket}/{filename}

With the Supabase API credentials configured in n8n, the HTTP Request node handles authentication automatically. After upload, a second request generates the signed URL:

plaintext
POST https://[your-project]/storage/v1/object/sign/{bucket}/{filename}
Body: { "expiresIn": 86400 }

The sub-workflow has error handling branches for both the upload and signing steps, returning appropriate status codes.
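Outside n8n, the same two calls can be sketched directly with requests. This is a hypothetical helper, not the sub-workflow itself: SUPABASE_URL and SERVICE_KEY are placeholders, upload_and_sign is my own name, and the signedURL response parsing should be verified against your Supabase version.

```python
import requests  # third-party, but already installed in the custom runner image

SUPABASE_URL = "https://your-project.supabase.co"  # placeholder
SERVICE_KEY = "service-role-key"                   # placeholder

def object_path(bucket, filename, sign=False):
    """Build the upload or sign endpoint used by the sub-workflow."""
    prefix = "object/sign" if sign else "object"
    return f"{SUPABASE_URL}/storage/v1/{prefix}/{bucket}/{filename}"

def upload_and_sign(bucket, filename, data, mime="image/png", expires=86400):
    headers = {"Authorization": f"Bearer {SERVICE_KEY}", "apikey": SERVICE_KEY}
    # Step 1: upload the raw bytes
    r = requests.post(object_path(bucket, filename),
                      headers={**headers, "Content-Type": mime}, data=data)
    r.raise_for_status()
    # Step 2: request a signed URL, valid for 24 hours by default
    r = requests.post(object_path(bucket, filename, sign=True),
                      headers=headers, json={"expiresIn": expires})
    r.raise_for_status()
    # Supabase returns a relative "signedURL" path; prepend the storage base
    return f"{SUPABASE_URL}/storage/v1{r.json()['signedURL']}"
```

In n8n, the credential node handles the Authorization header, so the workflow version only needs the two HTTP Request nodes.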

4. The Main Sticker Workflow

With the infrastructure in place, the main workflow is straightforward: receive prompt, generate image, process it, store it.


Main sticker workflow: webhook to final storage

4.1 Webhook Trigger

The workflow starts with a webhook that accepts POST requests:

plaintext
POST /webhook/sticker
Body: { "prompt": "a happy cat" }

The prompt flows into the Gemini node via expression: {{ $json.body.prompt }}

4.2 Gemini Image Generation

Google Gemini can generate images, but out of the box it can’t create images with a transparent background.

The key insight, which I took from Phil’s post: instead of trying to generate transparent images directly (which Gemini can’t do) or running expensive ML-based background removal, generate images with a solid chromakey green background. This is the same technique used in video production. Then remove the green programmatically. It’s faster and cheaper than running a separate segmentation model.

Here’s the prompt template (adapted from Phil’s original):

plaintext
Create a sticker illustration of: {{ $json.body.prompt }}

CRITICAL CHROMAKEY REQUIREMENTS:
1. BACKGROUND: Solid, flat, uniform chromakey green color.
   Use EXACTLY hex color #00FF00 (RGB 0, 255, 0).
   The entire background must be this single pure green color
   with NO variation, NO gradients, NO shadows.
2. WHITE OUTLINE: The subject MUST have a clean white outline/border
   (2-3 pixels wide) separating it from the green background.
3. NO GREEN ON SUBJECT: The subject itself should NOT contain any green colors.
4. SHARP EDGES: Crisp, sharp, well-defined edges.
5. CENTERED: Subject should be centered with padding around all sides.
6. STYLE: Vibrant, clean, cartoon/illustration sticker style with bold colors.

The white outline requirement is important. It creates a visual buffer between the subject and the green background, which makes the chromakey removal cleaner. Without it, you get edge artifacts. Gemini doesn’t always follow these instructions perfectly. Sometimes you get slight green tints or imperfect edges. The HSV-based removal handles most of these cases, but heavily green subjects (like a frog) can be problematic.

I’m using gemini-3-pro-image-preview for the image generation.

4.3 The Supabase Bridge

Next, we are going to use the intermediate storage pattern from Section 3. After Gemini generates the image, we immediately upload it to Supabase via the storage sub-workflow. This does three things:

  1. Enables Python processing. As discussed, Python nodes can’t receive binary input directly. By uploading first and getting a signed URL, Python can fetch the image via HTTP request.
  2. Creates a backup. If the background removal fails or produces poor results, the original is still available for debugging or manual processing.
  3. Provides a stable reference. The signed URL is valid for 24 hours. This gives plenty of time for the workflow to complete, and I can inspect intermediate results if something goes wrong.

The flow is: Gemini outputs binary → upload to Supabase → get signed URL → Python fetches from URL → processes → outputs new binary → upload final result to Supabase.

It’s a bit roundabout compared to direct binary passing, but it’s reliable and has the side benefit of preserving intermediate artifacts. It’s also quick: the upload steps took almost no time compared to the actual generation time the model needs.

5. The Background Removal Code

This is the core image-processing logic; I adapted Phil Schmid’s implementation slightly. Phil’s version uses scipy for morphological operations (edge cleanup via binary_dilation), which works great but adds a heavier dependency. Since I’m running this inside n8n task runners on a VPS, I wanted to keep the container size down. I swapped scipy for PIL’s built-in ImageFilter.MaxFilter, which achieves a similar effect for this use case.

5.1 The Full Code

python
import base64
import io
import numpy as np
from PIL import Image, ImageFilter
import requests

# Fetch the image from the Supabase signed URL
image_url = _items[0]["json"]["full_signedURL"]
response = requests.get(image_url)
img = Image.open(io.BytesIO(response.content)).convert("RGBA")

# HSV-based green mask detection
def get_green_mask(image):
    arr = np.array(image).astype(np.float32) / 255.0
    r, g, b = arr[:, :, 0], arr[:, :, 1], arr[:, :, 2]

    max_c = np.maximum(np.maximum(r, g), b)
    min_c = np.minimum(np.minimum(r, g), b)
    delta = max_c - min_c

    h = np.zeros_like(max_c)
    mask_g = (max_c == g) & (delta != 0)
    h[mask_g] = (60 * ((b[mask_g] - r[mask_g]) / delta[mask_g]) + 120)
    h[h < 0] += 360

    s = np.zeros_like(max_c)
    s[max_c != 0] = delta[max_c != 0] / max_c[max_c != 0]

    green_mask = (
        (np.abs(h - 120) < 25) &
        (s > 0.75) &
        (max_c > 0.70)
    )
    return green_mask

# Apply mask and cleanup
mask = get_green_mask(img)
mask_img = Image.fromarray((mask * 255).astype(np.uint8), mode='L')
cleaned_mask = mask_img.filter(ImageFilter.MaxFilter(3))

# Make green pixels transparent
data = np.array(img)
data[..., 3] = np.where(np.array(cleaned_mask) > 0, 0, 255)
final_img = Image.fromarray(data)

# Output as PNG
img_byte_arr = io.BytesIO()
final_img.save(img_byte_arr, format='PNG')
img_byte_arr.seek(0)

return [{
    "json": {"filename": "sticker.png"},
    "binary": {
        "data": {
            "data": base64.b64encode(img_byte_arr.getvalue()).decode('utf-8'),
            "mimeType": "image/png",
            "fileName": "sticker.png"
        }
    }
}]

5.2 HSV Over Simple RGB

You might think: just check if a pixel is green (r < threshold and g > threshold and b < threshold). This works for perfect chromakey green but fails for variations. Gemini doesn’t produce pixel-perfect #00FF00. There are slight gradients, compression artifacts, and anti-aliasing at edges.

The original blog clearly explains why we should use HSV color space instead. HSV (Hue, Saturation, Value) is better because:

  • Hue identifies the color independent of brightness. Green is around 120 degrees.
  • Saturation tells us how “pure” the color is. High saturation means vivid green, not grayish-green.
  • Value (brightness) helps exclude dark shadows that might have greenish hues.

The mask logic:

  • Hue within 25 degrees of pure green (120°)
  • Saturation above 75% (strongly colored, not washed out)
  • Value above 70% (reasonably bright)

This catches the chromakey background while leaving greenish tints on the subject alone.
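To see the three thresholds at work, here is the same mask logic restated for a single pixel. This is a standalone illustration, not workflow code; is_chromakey is a hypothetical helper that mirrors the thresholds above:

```python
def is_chromakey(r, g, b):
    """Apply the workflow's HSV thresholds to one RGB pixel in [0, 1]."""
    max_c, min_c = max(r, g, b), min(r, g, b)
    delta = max_c - min_c
    if delta == 0 or max_c != g:
        return False  # greyscale pixel, or green is not the dominant channel
    h = 60 * ((b - r) / delta) + 120  # hue in degrees, green branch
    s = delta / max_c                 # saturation
    return abs(h - 120) < 25 and s > 0.75 and max_c > 0.70

print(is_chromakey(0.0, 1.0, 0.0))     # pure #00FF00 background -> True
print(is_chromakey(0.13, 0.55, 0.13))  # dark forest green -> False (value too low)
print(is_chromakey(1.0, 0.0, 0.0))     # red subject pixel -> False (not green)
```

The second case is the interesting one: a dark green subject pixel passes the hue and saturation tests but fails the brightness test, which is exactly why the value threshold is there.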

5.3 The MaxFilter Cleanup

After creating the mask, we apply a MaxFilter(3). This expands the mask slightly, which helps remove edge pixels that are partially green due to anti-aliasing. Without this, you often get a thin green fringe around the subject.
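The dilation is easy to see in isolation. In this standalone snippet, a single masked pixel grows into its full 3x3 neighborhood:

```python
import numpy as np
from PIL import Image, ImageFilter

mask = np.zeros((5, 5), dtype=np.uint8)
mask[2, 2] = 255  # one "green" pixel flagged in the mask

# MaxFilter(3) replaces each pixel with the max of its 3x3 neighborhood
grown = np.array(Image.fromarray(mask, mode="L").filter(ImageFilter.MaxFilter(3)))
print(int((grown == 255).sum()))  # 9: the pixel plus its 8 neighbours
```

Applied to the real mask, that one-pixel growth is what eats the anti-aliased fringe at the sticker's edge.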

5.4 Returning Binary Data

The n8n Python code node expects a specific format for binary output. The return value must be a list of items, each with json and binary properties. The binary data needs to be base64-encoded with explicit MIME type and filename.

6. Example Result

Let me show what this actually produces. Here’s a real request:

bash
curl -X POST https://n8n.janwillemaltink.com/webhook-test/sticker \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a cute happy cat"}'

Gemini generates the image with the chromakey green background:

Gemini output with green background

Gemini output: sticker illustration with chromakey green background

After the Python processing removes the green and creates transparency:

Final transparent sticker

Final result: transparent PNG ready to use

The complete flow:

plaintext
[Webhook]
    → [Gemini: Generate Image]
    → [Sub-workflow: Upload Original to Supabase]
    → [Python: Remove Background]
    → [Sub-workflow: Upload Final PNG to Supabase]
    → [Return Signed URL]

The webhook is configured with responseMode: "lastNode", so the final signed URL is returned directly to the caller.

7. What’s Next

The current setup works but has room for improvement. A few ideas:

  • Queue system: For high volume, the synchronous webhook could timeout. Moving to an async pattern with status polling would be more robust.
  • Multiple output sizes: Generate different resolutions for different use cases.
  • Style presets: Pre-configured prompt modifiers for different sticker aesthetics (kawaii, minimalist, etc.).
  • Edge refinement: More sophisticated edge detection could improve the mask quality on complex subjects.

For now though, it does what I needed: text prompt in, transparent sticker out.

I submitted both the Supabase and the Gemini workflows as n8n templates, and I will add them to this post as soon as they are accepted.

- Jan Willem