Agentic Workflows Explained: What They Are and How They Work
An agentic workflow is a data pipeline where LLM agents make decisions at runtime — selecting tools, routing between stages, and remediating failures through reasoning. Built on LangGraph, these workflows give data engineers self-healing pipelines that adapt to what they find in the data, not just what you anticipated at design time.
A Minimal Agentic Pipeline
```python
from typing import Annotated, TypedDict

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages

# 1. Define a tool the agent can call
@tool
def count_rows(table: str) -> int:
    """Count rows in a table."""
    # `db` is assumed to be an existing database connection
    return db.execute(f"SELECT COUNT(*) FROM {table}").scalar()

# 2. Build a one-node graph (`llm` is any chat model with tool-calling support)
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

agent = llm.bind_tools([count_rows])

def agent_node(state):
    return {"messages": [agent.invoke(state["messages"])]}

graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_edge("agent", END)
app = graph.compile()

# 3. Run it
result = app.invoke({"messages": [HumanMessage("How many rows in orders?")]})
```
Core Concepts
Agent
An LLM-powered node in the graph that receives the current state (message history + metadata), reasons about what to do, calls one or more tools, and returns an updated state. Agents are stateless functions — all memory lives in the shared graph state, not inside the agent.
Tool
A typed Python function decorated with @tool that an agent can call. The function signature and docstring are the tool's schema — the LLM reads them to decide when to call it and what arguments to pass. Tools are deterministic; only the selection decision involves LLM reasoning.
State + Supervisor
All agents share a TypedDict state containing messages, task metadata, and results. The supervisor agent reads this state after every worker run and routes to the next node or END. State is serialized to Redis after each step via LangGraph's checkpointer — enabling crash recovery without restarting from scratch.
Key LangGraph Concepts
| Concept | What it is | Analogy |
|---|---|---|
| StateGraph | The directed graph that defines your pipeline | The DAG definition in Airflow |
| Node | An agent function registered in the graph | An Airflow operator |
| Edge | A connection between two nodes | A task dependency in Airflow |
| Conditional edge | Routes to different nodes based on state | BranchPythonOperator |
| Checkpointer | Persists state after every step | Airflow task instance XCom |
| recursion_limit | Max steps before the graph raises GraphRecursionError | dagrun_timeout in Airflow (a budget in steps rather than wall-clock time) |
Common Mistakes
Putting all logic in one agent
One mega-agent with 20 tools is hard to test, debug, and improve. Decompose into specialist agents (ingest, validate, transform) and a supervisor that routes between them.
Skipping tool docstrings
The LLM selects tools entirely based on their docstrings. Vague docs (e.g., "does database stuff") cause wrong tool selection. Write precise, example-driven docstrings.
No max_iterations guard
Without recursion_limit and an error_count guard in your state, a confused agent will loop until you hit the API rate limit or run out of money.
FAQ
- What is an agentic workflow?
- A data pipeline where LLM agents make decisions at runtime — selecting tools, routing work, and remediating failures through reasoning. Built on LangGraph, they adapt to runtime conditions instead of following a fixed pre-coded DAG.
- What is a tool in an agentic workflow?
- A typed Python function decorated with @tool that an agent can call. The agent reads the docstring to decide when to use it. Tools are deterministic — only the selection involves LLM reasoning.
- What is agent state in LangGraph?
- A shared TypedDict that all agents read and write. Serialized to Redis after every node. If the pipeline crashes, you resume from the last checkpoint — no steps are re-executed.
- What is a supervisor agent?
- The orchestrator node. It reads worker output and routes to the next worker, retries, or returns FINISH. It uses LLM reasoning for routing — unlike Airflow's static conditional branching.