
the graph data model

you write nested. Zef stores flat. the glue between.

the eternal OOP / SQL divorce

In code, we think nested:

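from dataclasses import dataclass

# @dataclass generates the keyword-argument constructor used below
@dataclass
class User:
    name: str
    emails: list[str]

alice = User(name='Alice', emails=['a@x', 'a@y'])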

In SQL, we must shred it flat: a list of emails can't live in a single column, so we make a second table and wire the two together with foreign keys.

users table            emails table
┌──────┬───────┐       ┌─────────┬───────┐
│ id   │ name  │       │ user_id │ email │
├──────┼───────┤       ├─────────┼───────┤
│ 1    │ Alice │       │ 1       │ a@x   │
└──────┴───────┘       │ 1       │ a@y   │
                       └─────────┴───────┘

→ N+1 queries, ORM plumbing, migrations every time the shape changes.
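To make the shredding concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names mirror the two tables above; the helper load_user is a hypothetical name chosen for illustration, and it needs one query per table to rebuild the nested shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (user_id INTEGER REFERENCES users(id), email TEXT);
    INSERT INTO users  VALUES (1, 'Alice');
    INSERT INTO emails VALUES (1, 'a@x'), (1, 'a@y');
""")

def load_user(user_id):
    """Reassemble the nested shape from the two flat tables (hypothetical helper)."""
    (name,) = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    emails = [
        row[0]
        for row in conn.execute(
            "SELECT email FROM emails WHERE user_id = ? ORDER BY email",
            (user_id,),
        )
    ]
    return {"name": name, "emails": emails}

alice = load_user(1)
print(alice)   # {'name': 'Alice', 'emails': ['a@x', 'a@y']}
```

Loading many users this way, one extra query per user, is the N+1 pattern mentioned above.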

zef's answer: write nested, store graph

the grand trick

You write:

ET.User(name='Alice', emails_=['a@x', 'a@y'])

Zef stores:

(User) ──[name]───────▶ ("Alice")
  │
  ├──[emails]─────▶ ("a@x")
  └──[emails]─────▶ ("a@y")

Distinct nodes for the User, for "Alice", for each email. The User node is a bare identity; the edges carry the information.
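The nested-to-graph step can be sketched in plain Python. This is not Zef's implementation, only an illustration of the idea: the function flatten and the "User#1" style identifiers are hypothetical, and a list-valued field simply becomes one edge per element.

```python
from itertools import count

_ids = count(1)   # hypothetical id source for the sketch

def flatten(entity_type, fields):
    """Turn one nested declaration into (source, relation, target) triples."""
    node = f"{entity_type}#{next(_ids)}"   # bare identity, no data inside
    edges = []
    for relation, value in fields.items():
        # a list-valued field contributes one edge per element
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            edges.append((node, relation, target))
    return node, edges

user, edges = flatten("User", {"name": "Alice", "emails": ["a@x", "a@y"]})
for edge in edges:
    print(edge)
# ('User#1', 'name', 'Alice')
# ('User#1', 'emails', 'a@x')
# ('User#1', 'emails', 'a@y')
```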

Mental model rules:

entities vs values

                      entity node                            value node
role                  identity anchor                        data carrier
e.g.                  ET.User, ET.Page                       "Alice", 42, True
internal structure    none (bare identity)                   the data
two equal ones are…   still distinct (different identity)    the same node (by value)

This is the Wittgenstein-Tractatus model: "objects are simple", meaning they carry no intrinsic structure. Structure emerges from the edges between them.
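The identity-versus-value distinction can be modelled in a few lines of plain Python. The class names EntityNode and ValueNodes are hypothetical and only illustrate the rule; they say nothing about how Zef stores nodes internally.

```python
class EntityNode:
    """Identity anchor: carries only its type, compared by identity."""
    def __init__(self, entity_type):
        self.entity_type = entity_type

class ValueNodes:
    """Interning table: one node per distinct value."""
    def __init__(self):
        self._nodes = {}

    def get(self, value):
        key = (type(value), value)   # keeps 1 and True apart
        return self._nodes.setdefault(key, ("value", value))

u1, u2 = EntityNode("User"), EntityNode("User")
values = ValueNodes()
a1, a2 = values.get("Alice"), values.get("Alice")

print(u1 == u2)   # False: two equal-looking entities stay distinct
print(a1 is a2)   # True: two equal values share one node
```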

relations carry everything

In Zef, whatever an entity has isn't stored inside it; it's stored on the edges leaving it. Every edge has this shape:

(source entity) ──[RT.relation_name]──▶ (target node)

RT is "relation type", the cousin of ET. The same rules apply: you invent relation types on the fly.

the "zero, one, infinity" rule

Physically, every field is a set of outgoing edges. It might have 0, 1, or 37 edges; the storage mechanism is the same.

user | Out[RT.nickname] | collect   # []    no edges
user | Out[RT.name]     | collect   # ['Alice']   one edge
user | Out[RT.email]    | collect   # ['a@x', 'a@y']   two edges
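A plain-Python sketch of the same rule, assuming edges are kept in a dictionary keyed by (source, relation). The names add_edge and out are hypothetical stand-ins, not Zef functions; the point is that reading a field always returns a list, whatever the cardinality.

```python
from collections import defaultdict

# (source, relation) -> list of targets; hypothetical storage for the sketch
edges = defaultdict(list)

def add_edge(source, relation, target):
    edges[(source, relation)].append(target)

def out(source, relation):
    """Return all targets of a relation; an absent relation gives []."""
    return list(edges.get((source, relation), []))

add_edge("user", "name", "Alice")
add_edge("user", "email", "a@x")
add_edge("user", "email", "a@y")

print(out("user", "nickname"))   # []
print(out("user", "name"))       # ['Alice']
print(out("user", "email"))      # ['a@x', 'a@y']

# going from one name to many is just another edge, no schema change
add_edge("user", "name", "Ally")
print(out("user", "name"))       # ['Alice', 'Ally']
```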

no migrations for cardinality changes

Going from "one name" to "many names" isn't a schema change. The storage is already "many." You just start writing / reading the plural form.

Contrast with SQL: ALTER TABLE users DROP COLUMN name; CREATE TABLE user_names (...); plus a data migration plus ORM retooling.
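For comparison, the SQL side of that change might look like the sketch below, again using sqlite3 in memory. The user_names column layout is an assumption, since the text elides it, and the DROP COLUMN step is left as a comment because it requires SQLite 3.35 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES (1, 'Alice');

    -- the migration: a new table, then copy the existing data across
    CREATE TABLE user_names (user_id INTEGER REFERENCES users(id), name TEXT);
    INSERT INTO user_names (user_id, name) SELECT id, name FROM users;
""")
# The old column would then be removed with
#   ALTER TABLE users DROP COLUMN name
# which this sketch skips so it runs on older SQLite builds.

# only now can a user carry a second name
conn.execute("INSERT INTO user_names VALUES (1, 'Ally')")
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user_names WHERE user_id = 1 ORDER BY name"
    )
]
print(names)   # ['Alice', 'Ally']
```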

building a graph

A Graph is a value (like a list). You construct one from entity declarations:

g = Graph([
    ET.Person(
        '🍃-01abc...',
        name='Alice',
        lives_in=ET.City(
            '🍃-02def...',
            name='Berlin',
            population=3_600_000,
        ),
        friend_=[
            ET.Person('🍃-03ghi...', name='Bob'),
            ET.Person('🍃-04jkl...', name='Carol'),
        ],
        ],
    ),
])

# commit to the graph store (so we can query it later)
g.add_to_graph_store()

Zef normalizes this tree of declarations into distinct entities and edges. The "Berlin" city is a single node, even if 100 Persons live there. The same goes for "Alice" as a value: one node, reused wherever it appears.
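One way to picture the normalization step is the sketch below. It assumes declarations are plain dicts with "type", "uid" and "fields" keys, which is a hypothetical format for illustration and not Zef's own; entities are keyed by uid and edges live in a set, so a city declared under two people still ends up as one node with one name edge.

```python
def normalize(decl, entities, edges):
    """Walk a nested declaration and return the uid of its root entity."""
    uid = decl["uid"]
    entities[uid] = decl["type"]          # same uid -> same node
    for relation, value in decl.get("fields", {}).items():
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            if isinstance(target, dict):  # nested entity declaration
                target = normalize(target, entities, edges)
            edges.add((uid, relation, target))
    return uid

# hypothetical uids, chosen only for the sketch
berlin = {"type": "City", "uid": "city-1", "fields": {"name": "Berlin"}}
people = [
    {"type": "Person", "uid": "p-1",
     "fields": {"name": "Alice", "lives_in": berlin}},
    {"type": "Person", "uid": "p-2",
     "fields": {"name": "Bob", "lives_in": berlin}},
]

entities, edges = {}, set()
for person in people:
    normalize(person, entities, edges)

print(len(entities))   # 3: two persons and a single Berlin
print(sorted(e for e in edges if e[1] == "lives_in"))
# [('p-1', 'lives_in', 'city-1'), ('p-2', 'lives_in', 'city-1')]
```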

reading back

# all Persons in the graph
people = g | all(ET.Person) | collect

# find Alice
alice = g | all(ET.Person) | filter(F.name == 'Alice') | first | collect

# her friends' names
names = alice | Fs.friend | map(F.name) | collect
# {'Bob', 'Carol'}

# her city
city_name = alice | F.lives_in | F.name | collect     # 'Berlin'

graphs are VALUES

A Graph is just another value, like a string or a dict. You can:

the SQL mapping, one more time

SQL            Zef
table          entity type (ET.Person)
row            entity node
column         relation type (RT.name)
cell           target value node
foreign key    edge
NULL           edge absent (empty set)
JOIN           follow the edge

what you lose (and gain)

Lose: compile-time schema enforcement. There's no DDL.

Gain: evolution without migrations. Sparse, optional fields for free. Many-to-many without join tables. Graph traversals that read like English. Refinement types when you do want constraints.

graph-ref anatomy

When you pull an entity from a stored graph, you get a graph reference:

alice_ref = g | all(ET.Person) | first | collect
print(alice_ref)   # ET.Person('🕸️-1-...')

The 🕸️- prefix means "graph-local reference." It points to a specific node in this specific graph. Chain ZefOps to traverse out from it.

putting it all together

g = Graph([
    ET.User('🍃-001a...', name='Alice', age=30,
            email_={'[email protected]', '[email protected]'}),
    ET.User('🍃-002b...', name='Bob',   age=25),
    ET.User('🍃-003c...', name='Carol', age=42,
            email_={'[email protected]'}),
])
g.add_to_graph_store()

# Which users have zero emails?
g | all(ET.User) | filter(Fs.email | length == 0) | map(F.name) | collect
# ['Bob']

# average age of users with emails
# parentheses let the pipeline continue across lines
(g | all(ET.User)
   | filter(Fs.email | length > 0)
   | map(F.age) | apply({'sum': reduce(add), 'n': length})
   | apply(Z['sum'] / Z['n']) | collect)
# 36.0
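The two results can be cross-checked without Zef. The sketch below restates the same three users as plain Python data; the email addresses are placeholders invented for the check, and only the counts per user (two, zero, one) are taken from the example.

```python
# placeholder addresses; only the number of emails per user matters here
users = {
    "Alice": {"age": 30, "email": {"alice@x", "alice@y"}},
    "Bob":   {"age": 25, "email": set()},
    "Carol": {"age": 42, "email": {"carol@x"}},
}

# which users have zero emails?
no_email = [name for name, u in users.items() if len(u["email"]) == 0]
print(no_email)   # ['Bob']

# average age of users with at least one email
ages = [u["age"] for u in users.values() if len(u["email"]) > 0]
average = sum(ages) / len(ages)
print(average)    # 36.0
```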

visualize it

For the graph above, draw the nodes and edges on a napkin. How many distinct value-nodes are there for "Alice", "Bob", "Carol"? (Hint: 3.) How many distinct email value-nodes? (Hint: 3.)

Next up: updating graphs (field= vs field_= vs this+…) →