🐍

crawl: graph → tree

declarative traversal. you describe what you want, zef figures out how.

the problem

You have a graph. You want a tree-shaped view of part of it — maybe for a JSON response, maybe for a report. If you wrote it by hand you'd:

  1. Start at a node
  2. Follow some edges
  3. Check edge types
  4. Handle single vs multi values
  5. Detect cycles
  6. Recurse appropriately

That's 50 lines of careful code per view. crawl collapses all six steps into one declaration.
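For a sense of scale, here is a sketch of those six steps in plain Python, with dicts standing in for graph nodes (no zef involved; names are illustrative):

```python
# Hand-rolled projection: the kind of code crawl replaces. A trailing
# "_" key holds a list of linked nodes.

def person_view(node, seen=frozenset()):
    if id(node) in seen:                  # 5. detect cycles...
        return {'ref': node['name']}      #    ...and break them with a ref
    seen = seen | {id(node)}
    return {
        'name': node['name'],             # 1-4. start at a node, follow
        'age':  node['age'],              #      edges, handle single values
        'friend_': [person_view(f, seen)  # 6. recurse on multi values
                    for f in node.get('friend_', [])],
    }

bob   = {'name': 'Bob',   'age': 25}
alice = {'name': 'Alice', 'age': 30, 'friend_': [bob]}
person_view(alice)
# {'name': 'Alice', 'age': 30,
#  'friend_': [{'name': 'Bob', 'age': 25, 'friend_': []}]}
```

Every new view means another function like this, which is exactly the repetition crawl removes.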

quick start

from zef import *

g = Graph([
    ET.Person('๐Ÿƒ-aaaa000000000000aaaa',
        name='Alice',
        age=30,
        friend_=[
            ET.Person('๐Ÿƒ-bbbb000000000000bbbb', name='Bob', age=25)
        ],
    ),
])
g.add_to_graph_store()

alice = g | all(ET.Person) | filter(F.name == 'Alice') | first | collect

alice | crawl({
    ET.Person: {
        'name':    F.name,
        'age':     F.age,
        'friend_': Fs.friend,
    }
}) | collect

Result:

{
  'name': 'Alice',
  'age':  30,
  'friend_': [
    {'name': 'Bob', 'age': 25, 'friend_': []}
  ]
}
mental model — crawl is a schema

The argument to crawl is a dict keyed by entity type. Each entry says "when you encounter this type, here's the shape to emit." Zef walks outgoing edges matching your field rules and recurses on each linked entity.

crawl({
    ET.Person: {
        'name':    F.name,       # single field
        'friend_': Fs.friend,    # field containing another entity:
    }                            #   becomes a multi-key in output
})                               #   and triggers recursion
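The schema idea can be modeled in plain Python (a hypothetical stand-in, not the zef API): types are strings, and one generic function interprets the dict of rules:

```python
# Schema-driven projector: one walker, many shapes. Rules map output
# keys to (field, is_multi) pairs; everything here is illustrative.

def crawl_like(node, schema, seen=frozenset()):
    rules = schema.get(node['type'])
    if rules is None or id(node) in seen:
        return {'ref': node['name']}      # unknown type or cycle: bare ref
    seen = seen | {id(node)}
    out = {}
    for key, (field, multi) in rules.items():
        if multi:
            out[key] = [crawl_like(v, schema, seen)
                        for v in node.get(field, [])]
        else:
            v = node[field]
            out[key] = crawl_like(v, schema, seen) if isinstance(v, dict) else v
    return out

schema = {'Person': {'name': ('name', False), 'friend_': ('friend_', True)}}
bob   = {'type': 'Person', 'name': 'Bob'}
alice = {'type': 'Person', 'name': 'Alice', 'friend_': [bob]}
crawl_like(alice, schema)
# {'name': 'Alice', 'friend_': [{'name': 'Bob', 'friend_': []}]}
```

The point of the design: the walker never changes, only the schema value does.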

F vs Fs in crawl

Same rules as before. F.x = single, Fs.x = set.

rule                     meaning
'name': F.name           single value, error if 0 or >1
'emails_': Fs.email      set of values (empty if none)
'friend_': Fs.friend     traverses into other entities, recurses
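The single-vs-set semantics in that table can be sketched over plain dicts where every field holds a list of values (helper names are hypothetical, not zef's):

```python
# F-style lookup: demand exactly one value, error otherwise.
def f_single(node, field):
    vals = node.get(field, [])
    if len(vals) != 1:
        raise ValueError(f"{field}: expected 1 value, got {len(vals)}")
    return vals[0]

# Fs-style lookup: zero or more values, empty set if the field is absent.
def fs_set(node, field):
    return set(node.get(field, []))

n = {'name': ['Alice'], 'email_': ['a@x.io', 'a@y.io']}
f_single(n, 'name')     # 'Alice'
fs_set(n, 'email_')     # {'a@x.io', 'a@y.io'}
fs_set(n, 'phone_')     # set()
```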

multiple types in one crawl

You can describe rules for each type your crawl will touch:

forest | crawl({
    ET.Habitat: {
        'name':       F.name,
        'climate':    F.climate,
        'residents_': Fs.resident_animal,
    },
    ET.Animal: {
        'name':    F.name,
        'species': F.species,
        'diet_':   Fs.eats,
    },
    ET.Plant: {
        'name':    F.name,
        'edible?': F.edible,
    }
}) | collect

When the crawler hits a Habitat, it applies the Habitat rules. When the crawler hits an Animal, it applies Animal rules. Entities of types not in the dict are handled as bare refs (no further recursion) — the natural way to stop a traversal.

cycle detection (automatic)

Graphs have cycles. Alice's friend is Bob. Bob's friend is Alice. Naive recursion explodes. crawl tracks every entity it has already visited and breaks repeats with a bare ref:

alice.friend_=[bob]
bob.friend_=[alice]

alice | crawl({ET.Person: {'name':F.name, 'friend_':Fs.friend}}) | collect
# {'name':'Alice', 'friend_':[
#   {'name':'Bob', 'friend_':[ET.Person('🃏-alice-uid')]}
# ]}
#                                         ^ cycle broken with a ref
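The visited-set mechanism behind this can be sketched in plain Python on an actual mutual cycle:

```python
# Cycle-safe recursion: remember which nodes are already on the current
# path, and emit a bare ref instead of recursing a second time.

def names(node, seen=frozenset()):
    if id(node) in seen:
        return {'ref': node['name']}
    seen = seen | {id(node)}
    return {'name': node['name'],
            'friend_': [names(f, seen) for f in node['friend_']]}

alice = {'name': 'Alice', 'friend_': []}
bob   = {'name': 'Bob',   'friend_': [alice]}
alice['friend_'].append(bob)          # alice <-> bob, a real cycle
names(alice)
# {'name': 'Alice',
#  'friend_': [{'name': 'Bob', 'friend_': [{'ref': 'Alice'}]}]}
```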

the ellipsis trick — all fields

Don't want to enumerate every field? Use ... to say "all of them":

alice | crawl({
    ET.Person: ...,
}) | collect
# emits all fields that exist on each Person, with default F / Fs semantics
# based on field name (trailing _ = multi)

Mix specific rules + ellipsis:

alice | crawl({
    ET.Person: {
        'friend_': Fs.friend | take(5),    # explicitly cap friends at 5
        ...:       ...,                      # everything else default
    }
}) | collect
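The naming convention driving those defaults is simple enough to sketch in plain Python:

```python
# Default rule inference: a trailing "_" in the field name means
# multi-valued (Fs-style); anything else is single-valued (F-style).

def default_rule(field):
    return 'multi' if field.endswith('_') else 'single'

fields = ['name', 'age', 'friend_', 'email_']
{f: default_rule(f) for f in fields}
# {'name': 'single', 'age': 'single', 'friend_': 'multi', 'email_': 'multi'}
```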

transforms in rules

Rules are ZefOps, so you can transform on the way out:

alice | crawl({
    ET.Person: {
        'name':        F.name | to_upper_case,
        'age_years':   F.age,
        'age_months':  F.age | multiply(12),
        'shouty':      apply(lambda p: f"{p.name}!!!"),
    }
}) | collect
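Since a rule is just a pipeline, the transform step can be modeled as ordinary function composition, a plain-Python analogue of the "|" chaining above:

```python
# Rules as composed functions: each output key maps to a callable
# applied to the node on the way out.

def compose(*fns):
    def run(x):
        for fn in fns:
            x = fn(x)
        return x
    return run

rules = {
    'name':       compose(lambda p: p['name'], str.upper),
    'age_years':  lambda p: p['age'],
    'age_months': compose(lambda p: p['age'], lambda a: a * 12),
    'shouty':     lambda p: f"{p['name']}!!!",
}
alice = {'name': 'Alice', 'age': 30}
{k: rule(alice) for k, rule in rules.items()}
# {'name': 'ALICE', 'age_years': 30, 'age_months': 360, 'shouty': 'Alice!!!'}
```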

a realistic example: a blog API response

g = Graph([
    ET.Post('๐Ÿƒ-p1',
        title='Hello Zef',
        body='...',
        author=ET.User('๐Ÿƒ-u1', name='Alice'),
        tag_={'intro', 'zef'},
        comment_=[
            ET.Comment(body='great!', by=ET.User('🃏-u2', name='Bob')),
            ET.Comment(body='lol',    by=ET.User('🃏-u3', name='Carol')),
        ],
    ),
])
g.add_to_graph_store()

api_shape = {
    ET.Post: {
        'title':    F.title,
        'body':     F.body,
        'author':   F.author,
        'tags_':    Fs.tag,
        'comments_': Fs.comment,
    },
    ET.User: {
        'name': F.name,
    },
    ET.Comment: {
        'body': F.body,
        'by':   F.by,
    },
}

post = g | all(ET.Post) | first | collect
response = post | crawl(api_shape) | collect
# ready to JSON-encode and send
resp_json = response | to_json | collect

one line moves your data

That post | crawl(api_shape) is doing the work of an ORM's eager-loading mechanism + a serializer + a view template. All as one declarative value.

You can save api_shape to a file, version it, compare it, A/B test two shapes, store it in a DB. It's just data.
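That "just data" property is easy to demonstrate with a simplified shape, plain strings standing in for the ET/F rules:

```python
# Shapes as values: serialize, load, and diff them like any other data.
import json

shape_v1 = {'Post': ['title', 'body'], 'User': ['name']}
shape_v2 = {'Post': ['title', 'body', 'tags_'], 'User': ['name']}

saved = json.dumps(shape_v1, sort_keys=True)   # version it in a file
assert json.loads(saved) == shape_v1           # round-trips losslessly

set(shape_v2['Post']) - set(shape_v1['Post'])  # diff two shapes: {'tags_'}
```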

transitive traversals

Sometimes you want to follow the same edge more than one hop out. Use ZefOp composition inside the rule:

# friends of friends: everyone exactly two hops from Alice
friends_of_friends = Fs.friend | Fs.friend

alice | crawl({
    ET.Person: {
        'name': F.name,
        'extended_network_': friends_of_friends,
    }
}) | collect
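The composition reads as "one hop, then another hop". In plain Python the same idea is applying a one-hop function twice (names are hypothetical):

```python
# Two hops = one-hop traversal applied twice.
def friends(people):
    return [f for p in people for f in p.get('friend_', [])]

carol = {'name': 'Carol'}
bob   = {'name': 'Bob',   'friend_': [carol]}
alice = {'name': 'Alice', 'friend_': [bob]}

two_hops = friends(friends([alice]))   # like Fs.friend | Fs.friend
[p['name'] for p in two_hops]          # ['Carol']
```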

when NOT to use crawl

crawl turns a graph into a tree. If the result you need is itself graph-shaped (shared nodes kept shared, cycles preserved) or aggregates across the whole graph rather than projecting from one root, a hand-written traversal is the better tool.

exercise

Given a graph of ET.Company entities, each with an employee_ field pointing at ET.Person entities and a ceo field pointing at one of them, write a crawl rule to produce:

{'company': 'Acme', 'headcount': 42, 'ceo': 'Alice'}

solution

company | crawl({
    ET.Company: {
        'company':   F.name,
        'headcount': Fs.employee | length,
        'ceo':       F.ceo | F.name,
    }
}) | collect
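For comparison, a plain-Python equivalent of that projection over dict nodes:

```python
# The same shape computed by hand.
alice = {'name': 'Alice'}
staff = [alice] + [{'name': f'employee-{i}'} for i in range(41)]
acme  = {'name': 'Acme', 'employee_': staff, 'ceo': alice}

view = {'company':   acme['name'],
        'headcount': len(acme['employee_']),   # Fs.employee | length
        'ceo':       acme['ceo']['name']}      # F.ceo | F.name
view
# {'company': 'Acme', 'headcount': 42, 'ceo': 'Alice'}
```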

Next up: FX — effects as first-class data. →