
Agentic Workflows Explained: What They Are and How They Work

An agentic workflow is a data pipeline where LLM agents make decisions at runtime — selecting tools, routing between stages, and remediating failures through reasoning. Built on LangGraph, agentic workflows give data engineers self-healing pipelines that adapt to what they find in the data, not just what you anticipated at design time.

A Minimal Agentic Pipeline

from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool

# Shared graph state: the message history, with new messages appended
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

# 1. Define a tool the agent can call
@tool
def count_rows(table: str) -> int:
    """Count rows in a table."""
    # Interpolating the table name is fine for a demo; validate it
    # against an allowlist in production to avoid SQL injection.
    return db.execute(f"SELECT COUNT(*) FROM {table}").scalar()

# 2. Build a one-node graph
agent = llm.bind_tools([count_rows])

def agent_node(state):
    return {"messages": [agent.invoke(state["messages"])]}

graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_edge("agent", END)
app = graph.compile()

# 3. Run it (assumes `db` and `llm` are defined elsewhere)
result = app.invoke({"messages": [HumanMessage("How many rows in orders?")]})

Core Concepts

Agent

An LLM-powered node in the graph that receives the current state (message history + metadata), reasons about what to do, calls one or more tools, and returns an updated state. Agents are stateless functions — all memory lives in the shared graph state, not inside the agent.

Framework: LangGraph nodes · LangChain tool binding · GPT-4 / Claude reasoning

Tool

A typed Python function decorated with @tool that an agent can call. The function signature and docstring are the tool's schema — the LLM reads them to decide when to call it and what arguments to pass. Tools are deterministic; only the selection decision involves LLM reasoning.

Examples: run_sql_query · fetch_api · validate_schema · write_to_s3 · send_alert
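Because the signature and docstring are the schema, you can approximate what the LLM sees with plain introspection. A sketch (validate_schema here is illustrative; the real @tool decorator builds a JSON schema, but from the same two sources):

```python
import inspect

def validate_schema(table: str, expected_cols: list) -> bool:
    """Check that a table contains every column in expected_cols.

    Use this after ingest, before transformation.
    Example: validate_schema("orders", ["id", "amount", "created_at"])
    """
    ...  # a real implementation would query information_schema

# Roughly what tool binding exposes to the LLM:
schema = {
    "name": validate_schema.__name__,
    "description": inspect.getdoc(validate_schema),
    "parameters": [
        (name, p.annotation.__name__)
        for name, p in inspect.signature(validate_schema).parameters.items()
    ],
}
```

Everything the agent needs to select this tool and fill its arguments comes from those two sources — which is why the docstring quality matters so much.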

State + Supervisor

All agents share a TypedDict state containing messages, task metadata, and results. The supervisor agent reads this state after every worker run and routes to the next node or END. State is serialized to Redis after each step via LangGraph's checkpointer — enabling crash recovery without restarting from scratch.

Checkpointing: Redis (prod) · MemorySaver (local dev) · SqliteSaver (tests)
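A minimal sketch of such a shared state and the supervisor's routing decision (field names and the hard-coded routing below are illustrative; a real supervisor would make this choice with LLM reasoning):

```python
from typing import TypedDict

class AgentState(TypedDict):
    messages: list      # full conversation / tool-call history
    task: str           # e.g. "load daily orders"
    results: dict       # worker outputs, keyed by node name
    error_count: int    # incremented by workers on failure

def supervisor_route(state: AgentState) -> str:
    """Pick the next node name, or 'FINISH' when the task is done."""
    if state["error_count"] >= 3:
        return "FINISH"                 # stop a repeatedly failing run
    if "ingest" not in state["results"]:
        return "ingest"
    if "validate" not in state["results"]:
        return "validate"
    if "transform" not in state["results"]:
        return "transform"
    return "FINISH"                     # all workers have reported
```

Wired into LangGraph, this function would back a conditional edge out of the supervisor node; the checkpointer serializes the whole AgentState after each step.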

Key LangGraph Concepts

Concept           What it is                                      Analogy
StateGraph        The directed graph that defines your pipeline   The DAG definition in Airflow
Node              An agent function registered in the graph       An Airflow operator
Edge              A connection between two nodes                  A task dependency in Airflow
Conditional edge  Routes to different nodes based on state        BranchPythonOperator
Checkpointer      Persists state after every step                 Airflow task instance XCom
recursion_limit   Max steps before the graph raises an error      dagrun_timeout in Airflow (a runaway-run guard)

Common Mistakes

Putting all logic in one agent

One mega-agent with 20 tools is hard to test, debug, and improve. Decompose into specialist agents (ingest, validate, transform) and a supervisor that routes between them.

Skipping tool docstrings

The LLM selects tools entirely based on their docstrings. Vague docs (e.g., "does database stuff") cause wrong tool selection. Write precise, example-driven docstrings.
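A before/after sketch (function names are illustrative):

```python
# Vague — the LLM has no basis for choosing this tool or its arguments:
def query_db(sql: str) -> list:
    """Does database stuff."""
    ...

# Precise — says what it does, when to use it, and gives an example call:
def run_sql_query(sql: str) -> list:
    """Execute a read-only SELECT against the analytics warehouse.

    Use for row counts, samples, and aggregates; never for writes.
    Example: run_sql_query("SELECT COUNT(*) FROM orders")
    """
    ...
```

The function bodies are identical stubs here on purpose: the only thing the LLM routes on is the text.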

No max_iterations guard

Without recursion_limit and an error_count guard in your state, a confused agent will loop until you hit the API rate limit or run out of money.
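Both guards sketched together (MAX_ERRORS and the routing function are an illustrative pattern, not a library feature; recursion_limit in the invoke config is LangGraph's built-in step cap):

```python
MAX_ERRORS = 3

def guarded_route(state: dict) -> str:
    """Supervisor routing with a hard stop on repeated failures."""
    if state.get("error_count", 0) >= MAX_ERRORS:
        return "FINISH"        # bail out instead of looping forever
    return "worker"            # otherwise keep working

# LangGraph's built-in cap on total steps, passed at invocation time;
# the graph raises an error once the step count exceeds it:
# app.invoke(inputs, config={"recursion_limit": 25})
```

Use both: the state guard catches a semantically stuck agent early, while recursion_limit is the last-resort backstop.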

FAQ

What is an agentic workflow?
A data pipeline where LLM agents make decisions at runtime — selecting tools, routing work, and remediating failures through reasoning. Built on LangGraph, they adapt to runtime conditions instead of following a fixed pre-coded DAG.
What is a tool in an agentic workflow?
A typed Python function decorated with @tool that an agent can call. The agent reads the docstring to decide when to use it. Tools are deterministic — only the selection involves LLM reasoning.
What is agent state in LangGraph?
A shared TypedDict that all agents read and write. Serialized to Redis after every node. If the pipeline crashes, you resume from the last checkpoint — completed steps are not re-executed.
What is a supervisor agent?
The orchestrator node. It reads worker output and routes to the next worker, retries, or returns FINISH. It uses LLM reasoning for routing — unlike Airflow's static conditional branching.
