AGNT · Team

Built by people you can actually message back.

Three people in Bali, the best AI stack money can rent, and a network of protocol partners building the open agent economy. No fake names. No fake logos. No fake numbers.

3 · Humans on the team
12 · AI models in the stack
4 · Protocol partners
1 · Founder you can DM

The founders

Three founders. One bet.

Small enough that you'll talk to a founder on day one. Experienced enough that we've shipped to production before.

miroko@agnt:~$ ./team --id 01
> thinking product

Product & Vision

Miroko

const miroko = {
  role: "product · vision · design",
  agents: "venue · dupe · calorie",
  skills: "UX · strategy · ops",
  ships: "since 2015",
  status: "online · 24/7",
};

// Killing the home screen

ernesto@agnt:~$ ./team --id 02
> reviewing PRs

Agent Architecture

Ernesto

const ernesto = {
  role: "agent loop · routing",
  agents: "orchestrator · memory",
  skills: "Python · TS · Rust",
  shipped: "47 prod systems",
  status: "deep work mode",
};

// Reading Postgres EXPLAIN plans

deniz@agnt:~$ ./team --id 03
> watching Sentry

Infrastructure

Deniz

const deniz = {
  role: "infra · payments · auth",
  agents: "transport · booking",
  skills: "Go · TS · K8s yaml",
  uptime: "99.97% / 90d",
  status: "watching Sentry",
};

// Webhook idempotency

Want to be number 4?

We're hiring engineers who like agents, protocols, and Bali.

Email the founder

The brain

12 models. One router. One chat.

We route every request to the model that actually does it best. Frontier reasoning for hard questions, open weights for cost, dedicated infra for speed. The user never knows which brain answered — they just get the answer.
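The routing idea can be sketched in a few lines. The model names, request traits, and thresholds below are illustrative stand-ins, not AGNT's actual router logic:

```typescript
// Illustrative request router: pick a backend by what the request needs.
// Trait names and the 200k-token cutoff are assumptions for the sketch.
type Model = "claude" | "gpt-5" | "gemini" | "groq-llama";

interface AgentRequest {
  text: string;
  needsTools?: boolean;     // tool / function calling
  contextTokens?: number;   // prompt size
  latencySensitive?: boolean;
}

function route(req: AgentRequest): Model {
  if (req.latencySensitive) return "groq-llama";   // dedicated fast inference
  if (req.contextTokens && req.contextTokens > 200_000) return "gemini"; // long context
  if (req.needsTools) return "gpt-5";              // tool calling · code
  return "claude";                                 // default reasoning brain
}
```

Each branch maps one of the strengths listed in the tiers below to a backend; the user-facing chat never surfaces which branch fired.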

Tier 1 · Frontier

Claude
Primary reasoning · default brain

GPT-5
Tool calling · code · fallback

Gemini
Long context · vision · live data

Tier 2 · Open & specialized

Llama
Open baseline

Mistral
Multilingual edge

Hermes
Open agent fine-tunes

DeepSeek
Cost-efficient reasoning

Qwen
Asian language coverage

Tier 3 · Inference & compute

Groq
Sub-second inference

Together AI
Open model hosting

Perplexity
Live web grounding

NVIDIA
GPU compute

agnt://router · live

YOU: "dinner Sat 6pm, sunset table"
> reasoning…

AGNT ROUTER
Claude · GPT-5 · Gemini · Groq · Perplexity · Mistral (idle until routed)

LATENCY 247ms · TOKENS 1432 · COST $0.00021 · MODEL Claude · REGION ap-se-1

trace.waterfall · tail -f
● tracing now

"dinner Sat 6pm, sunset table"
PARSE → ROUTE → INFER → RESPOND → Claude

avg latency 198ms · cost / 1k tokens $0.14

The hands

Memory. Retrieval. Tool use that actually ships.

A model alone is just a chatbot. The hard part is everything around it — long-term memory, hybrid retrieval, embeddings, reranking, multi-agent orchestration. We use the best tool for each layer instead of trying to build it all ourselves.

agnt://memory · pipeline

Every chat makes it smarter about you.

01 Message · "I like sunset dinners" · AGNT
02 Embed · Voyage AI
03 Store · Pinecone
04 Retrieve · Mem0
05 Reply · AGNT
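The embed → store → retrieve flow above can be sketched with tiny in-memory stand-ins. The real stack uses Voyage AI embeddings, Pinecone storage, and Mem0 retrieval; here a toy bag-of-letters "embedding" and an array "store" just show the shape of the pipeline:

```typescript
// Toy memory pipeline: embed → store → retrieve by cosine similarity.
// embed() is a hypothetical stand-in for a real embedding API.
function embed(text: string): number[] {
  const v: number[] = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1; // count letters a-z
  }
  return v;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

const memories: { text: string; vec: number[] }[] = [];

// "Store" stage: persist the message with its vector.
function remember(text: string): void {
  memories.push({ text, vec: embed(text) });
}

// "Retrieve" stage: rank stored memories against the new query.
function recall(query: string, k = 1): string[] {
  const q = embed(query);
  return [...memories]
    .sort((a, b) => cosine(q, b.vec) - cosine(q, a.vec))
    .slice(0, k)
    .map((m) => m.text);
}
```

So after `remember("I like sunset dinners")`, a later query like `recall("book a dinner table")` surfaces the preference before the reply is generated.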
LangChain
Agent orchestration

LangGraph
Stateful graphs

LlamaIndex
RAG pipelines

Mem0
Long-term memory

Pinecone
Vector search

Voyage AI
Best-in-class embeddings

Cohere Rerank
Result reranking

Hugging Face
Model hub

The network

Four pillars. One open agent stack.

AGNT runs on open agent infrastructure that we either built ourselves or contribute back to. Identity, reputation, messaging, runtime — every layer is open and verifiable. No black boxes.

OpenClaw · Identity

Agent DNA & passport.

The open identity layer for agents. Every agent on AGNT has an OpenClaw DNA — a verifiable identity, signing key, and capability manifest. Like a passport for software that acts on your behalf.
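Passport-style verification can be sketched with Node's built-in Ed25519 support. The field names below (`id`, `capabilities`) are hypothetical; the page only says an OpenClaw DNA carries a verifiable identity, signing key, and capability manifest:

```typescript
// Sketch: an agent signs its capability manifest; anyone can verify it
// against the public key in the passport. Shapes are assumptions.
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface AgentPassport {
  id: string;
  capabilities: string[]; // capability manifest
  publicKey: KeyObject;
}

// Issue: the agent holds the private key and signs its own manifest.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const passport: AgentPassport = {
  id: "agnt:venue-agent", // hypothetical id scheme
  capabilities: ["search_venues", "book_table"],
  publicKey,
};
const manifest = Buffer.from(
  JSON.stringify({ id: passport.id, capabilities: passport.capabilities })
);
const signature = sign(null, manifest, privateKey); // null = Ed25519's built-in hashing

// Verify: a forged or tampered manifest fails the signature check.
function checkPassport(p: AgentPassport, payload: Buffer, sig: Buffer): boolean {
  return verify(null, payload, p.publicKey, sig);
}
```

The point of the design is that verification needs only the passport itself, so any counterparty (user or agent) can check capabilities without calling home.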

NemoClaw · Reputation

Built on NVIDIA NeMo.

The reputation graph. Every completed booking, tool call, and rating feeds NemoClaw — an NVIDIA NeMo-powered model that scores agent trustworthiness so users (and other agents) know who to deal with.

⚡ Powered by NVIDIA NeMo
ClawPulse · Messaging

The A2A messaging server.

Our open A2A messaging server. Routes envelopes between agents in real-time, handles delivery guarantees, retries, fallbacks, and discovery. The TCP/IP of agent-to-agent communication — built by us, open to anyone.
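A minimal relay in that spirit might look like the sketch below. The envelope fields and the deliver-once rule are assumptions for illustration, not the actual ClawPulse protocol:

```typescript
// Minimal A2A relay sketch: route envelopes to registered agents,
// deduplicating on envelope id so sender retries deliver only once.
interface Envelope {
  id: string;   // unique per message; basis for idempotent delivery
  from: string;
  to: string;
  body: string;
}

type Handler = (env: Envelope) => void;

class Relay {
  private inboxes = new Map<string, Handler>();
  private delivered = new Set<string>();

  register(agentId: string, handler: Handler): void {
    this.inboxes.set(agentId, handler);
  }

  // Returns a status instead of throwing, so senders can retry or fall back.
  send(env: Envelope): "delivered" | "duplicate" | "unroutable" {
    if (this.delivered.has(env.id)) return "duplicate";
    const handler = this.inboxes.get(env.to);
    if (!handler) return "unroutable";
    this.delivered.add(env.id);
    handler(env);
    return "delivered";
  }
}
```

Keying dedupe on the envelope id is what makes sender-side retries safe: a network timeout can be retried blindly without the recipient seeing the message twice.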

Hermes Agents · Runtime

Open agent fine-tunes.

Open-weight agent runtime from Nous Research. Hermes models are fine-tuned specifically for tool calling, structured output, and long-running agent loops — what we use when we need to run agents off the frontier APIs.

Built by AGNT · open to anyone

ClawPulse is our A2A messaging server. Open protocol, open server, MIT license. Run your own instance or use ours.

ClawPulse on GitHub →

The compute · we own the silicon

We don't rent GPUs. We own them.

While everyone else queues for OpenAI quota, we run our own NVIDIA fleet — Blackwell, Hopper, Ampere. 96 chips, 9.4 TB of HBM, 32 PFLOPS of inference. The agent layer doesn't share.

96

NVIDIA GPUs

9.4 TB

HBM memory

32 PFLOPS

FP8 inference

0ms

Queue time

NVIDIA · Blackwell · ×8
B200 · Frontier reasoning · multi-modal
HBM 192 GB · B/W 8 TB/s · FP8 20 PFLOPS

NVIDIA · Hopper Refresh · ×16
H200 · Long-context inference · 1M tokens
HBM 141 GB · B/W 4.8 TB/s · FP8 4 PFLOPS

NVIDIA · Hopper · ×32
H100 · Production inference · the workhorse
HBM 80 GB · B/W 3.35 TB/s · FP8 2 PFLOPS

NVIDIA · Grace Hopper · ×4
GH200 · Vector DB + LLM on one superchip
HBM 144 GB · B/W 4 TB/s · FP8 1 PFLOPS

NVIDIA · Ampere · ×24
A100 · Embeddings · fine-tuning · RAG
HBM 80 GB · B/W 2 TB/s · FP8 0.6 PFLOPS

NVIDIA · Ada Lovelace · ×12
L40S · Vision · speech · diffusion
Memory 48 GB · B/W 864 GB/s · FP8 0.7 PFLOPS

Why it matters: When OpenAI throttles, we don't. When Anthropic queues, we don't. The agent layer runs on metal we own — colocated in Singapore + Jakarta. Sub-50ms to every venue in SEA.

The plumbing

Boring tech. That actually works.

The unsexy stuff we don't have to think about so we can focus on the agent layer.

Vercel
Edge runtime

AWS
Compute + storage

Supabase
Postgres + auth

Cloudflare
Edge cache + DDoS

Stripe
Subscriptions + payouts

Twilio
WhatsApp Business API

Datadog
Metrics + tracing

Sentry
Error observability

Real team. Real stack. Real product.

The fastest way to verify any of this is to use the thing. Two minutes in WhatsApp will tell you more than any pitch deck.