# OTP Thinking
Paradigm shifts for OTP design. These insights challenge typical concurrency and state management patterns.
## The Iron Law

**GENSERVER IS A BOTTLENECK BY DESIGN**

A GenServer processes ONE message at a time. Before creating one, ask:
- Do I actually need serialized access?
- Will this become a throughput bottleneck?
- Can reads bypass the GenServer via ETS?
The ETS pattern: the GenServer owns the ETS table, writes serialize through the GenServer, and reads bypass it entirely with `:ets.lookup/2`.

No exceptions: don't wrap stateless functions in a GenServer, and don't create a GenServer "for organization".
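A minimal sketch of the ETS pattern, assuming a hypothetical `Cache` module name: writes go through the GenServer, reads hit the table directly.

```elixir
defmodule Cache do
  use GenServer

  # Writes serialize through the GenServer.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  # Reads bypass the GenServer entirely — any process can hit the table.
  def get(key) do
    case :ets.lookup(__MODULE__, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  @impl true
  def init(_opts) do
    # The GenServer owns the table; :protected lets other processes read it.
    table = :ets.new(__MODULE__, [:named_table, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end
end
```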
## GenServer Patterns

| Function | Use For |
|---|---|
| `GenServer.call/3` | Synchronous requests expecting replies |
| `GenServer.cast/2` | Fire-and-forget messages |

When in doubt, use `call` to ensure back-pressure. Set appropriate timeouts for `call/3`.

Use `handle_continue/2` for post-init work—it keeps `init/1` fast and non-blocking.
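A sketch of the `handle_continue/2` pattern, with a hypothetical `Loader` module: `init/1` returns immediately and the slow work runs right after, before any other message is processed.

```elixir
defmodule Loader do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(_opts) do
    # Return immediately; defer the slow work to handle_continue/2.
    {:ok, %{data: nil}, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, state) do
    # Slow work (e.g. warming a cache) runs after init/1 returns,
    # but before any call or cast is handled.
    {:noreply, %{state | data: :loaded}}
  end

  @impl true
  def handle_call(:data, _from, state), do: {:reply, state.data, state}
end
```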
## Task.Supervisor, Not Task.async

`Task.async/1` spawns a linked process—if the task crashes, the caller crashes too.
| Pattern | On task crash |
|---|---|
| `Task.async/1` | Caller crashes (linked, unsupervised) |
| `Task.Supervisor.async/2` | Caller crashes (linked, supervised) |
| `Task.Supervisor.async_nolink/2` | Caller survives, can handle error |
Use Task.Supervisor for: production code, graceful shutdown, and observability.

Use Task.async for: quick experiments, scripts, and cases where crash-together is acceptable.
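The `async_nolink/2` row above can be sketched as follows (the failing task is illustrative):

```elixir
{:ok, sup} = Task.Supervisor.start_link()

task = Task.Supervisor.async_nolink(sup, fn -> raise "boom" end)

# The task is not linked to the caller, so the crash arrives as a
# monitor message instead of taking the caller down.
result =
  case Task.yield(task, 1_000) || Task.shutdown(task) do
    {:ok, value} -> {:ok, value}
    {:exit, reason} -> {:error, reason}  # caller survives and handles the error
    nil -> {:error, :timeout}
  end
```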
## DynamicSupervisor + Registry = Named Dynamic Processes

DynamicSupervisor only supports `:one_for_one` (dynamic children have no ordering). Use Registry for names—never create atoms dynamically:

```elixir
defp via_tuple(id), do: {:via, Registry, {MyApp.Registry, id}}
```
PartitionSupervisor scales DynamicSupervisor for millions of children.
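A sketch combining the pieces above; the `MyApp.Registry`, `MyApp.DynSup`, `MyApp.Counter`, and `MyApp.Counters` names are assumptions, and in a real app the Registry and DynamicSupervisor would live in your supervision tree rather than being started inline:

```elixir
defmodule MyApp.Counter do
  # A minimal worker; an Agent keeps the example short.
  use Agent

  def start_link(opts) do
    Agent.start_link(fn -> 0 end, name: Keyword.fetch!(opts, :name))
  end
end

defmodule MyApp.Counters do
  # String ids become Registry keys — no atoms are created at runtime.
  def via_tuple(id), do: {:via, Registry, {MyApp.Registry, id}}

  def start_counter(id) do
    DynamicSupervisor.start_child(MyApp.DynSup, {MyApp.Counter, name: via_tuple(id)})
  end

  def value(id), do: Agent.get(via_tuple(id), & &1)
end

# Normally these go under your application supervisor:
{:ok, _} = Registry.start_link(keys: :unique, name: MyApp.Registry)
{:ok, _} = DynamicSupervisor.start_link(name: MyApp.DynSup, strategy: :one_for_one)
```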
## :pg for Distributed, Registry for Local

| Tool | Scope | Use Case |
|---|---|---|
| Registry | Single node | Named dynamic processes |
| :pg | Cluster-wide | Process groups, pub/sub |

`:pg` replaced the deprecated `:pg2`.
Horde provides distributed supervisor/registry with CRDTs.
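A minimal `:pg` sketch using the default scope (the `:notifications` group name is illustrative); when nodes are connected, membership is visible cluster-wide:

```elixir
# :pg needs its scope process started (a supervision tree normally does this).
{:ok, _} = :pg.start_link()

# Join the current process to a group; any process can join any group.
:ok = :pg.join(:notifications, self())

# Readers fetch the current members and can fan out messages to them.
members = :pg.get_members(:notifications)
```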
## Broadway vs Oban: Different Problems

| Tool | Use For |
|---|---|
| Broadway | External queues (SQS, Kafka, RabbitMQ) — data ingestion with batching |
| Oban | Background jobs with database persistence |
Broadway is NOT a job queue.
## Broadway Gotchas

Processors are for runtime concerns, not code organization. Dispatch to modules inside `handle_message/3`; don't add processors for different message types.

`:one_for_all` is for Broadway bugs, not your code. Errors raised in your callbacks are caught and result in failed messages, not supervisor restarts.
Handle expected failures in the producer (connection loss, rate limits). Reserve max_restarts for unexpected bugs.
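The dispatch-over-processors advice can be sketched without Broadway itself; the `Pipeline.*` modules are hypothetical, and in a real pipeline `dispatch/1` would be called from `handle_message/3` on the message data:

```elixir
defmodule Pipeline.Orders do
  # Hypothetical per-type handler module.
  def handle(%{type: :order} = event), do: {:ok, {:order, event.id}}
end

defmodule Pipeline.Refunds do
  def handle(%{type: :refund} = event), do: {:ok, {:refund, event.id}}
end

defmodule Pipeline.Dispatch do
  # One processor, one entry point: route by message data,
  # not by adding a processor per message type.
  def dispatch(%{type: :order} = event), do: Pipeline.Orders.handle(event)
  def dispatch(%{type: :refund} = event), do: Pipeline.Refunds.handle(event)
  def dispatch(event), do: {:error, {:unknown_type, event}}
end
```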
## Supervision Strategies Encode Dependencies

| Strategy | Children Relationship |
|---|---|
| :one_for_one | Independent |
| :one_for_all | Interdependent (all restart) |
| :rest_for_one | Sequential dependency |
Use `:max_restarts` and `:max_seconds` to prevent restart loops.
Think about failure cascades BEFORE coding.
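A sketch of encoding a dependency with `:rest_for_one` plus restart limits (the `:dep_a`/`:dep_b` Agents are stand-ins for real workers):

```elixir
# :rest_for_one encodes "b depends on a": if :a crashes, :b restarts too,
# but a crash in :b leaves :a alone. max_restarts/max_seconds cap restart loops.
children = [
  %{id: :a, start: {Agent, :start_link, [fn -> :a end, [name: :dep_a]]}},
  %{id: :b, start: {Agent, :start_link, [fn -> :b end, [name: :dep_b]]}}
]

{:ok, sup} =
  Supervisor.start_link(children,
    strategy: :rest_for_one,
    max_restarts: 3,
    max_seconds: 5
  )
```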
## Abstraction Decision Tree

```
Need state?
├── No → Plain function
└── Yes → Complex behavior?
    ├── No → Agent
    └── Yes → Supervision?
        ├── No → spawn_link
        └── Yes → Request/response?
            ├── No → Task.Supervisor
            └── Yes → Explicit states?
                ├── No → GenServer
                └── Yes → GenStateMachine
```
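One branch of the tree in miniature: state with no complex behavior is just an Agent, not a GenServer.

```elixir
# "Yes state / No complex behavior" → Agent: state without callbacks.
{:ok, counter} = Agent.start_link(fn -> 0 end)

Agent.update(counter, &(&1 + 1))
Agent.update(counter, &(&1 + 1))

count = Agent.get(counter, & &1)
```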
## Storage Options

| Need | Use |
|---|---|
| Memory cache | ETS (`read_concurrency: true` for reads) |
| Static config | :persistent_term (faster than ETS) |
| Disk persistence | DETS (2GB limit) |
| Transactions/Distribution | Mnesia |
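A `:persistent_term` sketch for static config (the key is illustrative); writes trigger a global GC scan, which is why it suits rarely-changing data:

```elixir
# Global, read-optimized storage: reads are constant-time and copy-free,
# but every put/erase forces a scan of all process heaps. Write once, read often.
:ok = :persistent_term.put({MyApp, :api_url}, "https://example.com")

url = :persistent_term.get({MyApp, :api_url})
```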
## :sys Debugs ANY OTP Process

```elixir
:sys.get_state(pid)    # Current state
:sys.trace(pid, true)  # Trace events (TURN OFF when done!)
```
## Telemetry Is Built Into Everything

Phoenix, Ecto, and most libraries emit telemetry events. Attach handlers:

```elixir
:telemetry.attach("my-handler", [:phoenix, :endpoint, :stop], &handle/4, nil)
```

Use `Telemetry.Metrics` + reporters (StatsD, Prometheus, LiveDashboard).
## Red Flags - STOP and Reconsider
- GenServer wrapping stateless computation
- Task.async in production when you need error handling
- Creating atoms dynamically for process names
- Single GenServer becoming throughput bottleneck
- Using Broadway for background jobs (use Oban)
- Using Oban for external queue consumption (use Broadway)
- No supervision strategy reasoning
Any of these? Re-read The Iron Law and use the Abstraction Decision Tree.