Finding where a specific block is used across a Wagtail site can be challenging, especially when those blocks are deeply nested inside other blocks. For developers managing large content structures, this can make updates and maintenance time-consuming and difficult.
The provided module helps solve this problem by recursively searching for a particular block based on given criteria. It efficiently handles blocks nested inside StructBlock, StreamBlock, or even ListBlock, making it possible to identify every page where a given block type is used.
The code is available on GitHub as well as at the end of this article.
The find_block_in_pages function takes one or more criteria functions that are used to filter blocks. Each criteria function receives a bound_block and should return True if the block meets the criteria, and False otherwise. If all criteria functions return True, the block is yielded.
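The rule that every criteria function must pass can be tried out without a Wagtail install. The following is a minimal sketch, not the module's code: FakeBoundBlock is a hypothetical stand-in that exposes only the .block and .value attributes a criteria function reads, and caption_contains and matches are illustrative helpers.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeBoundBlock:
    """Hypothetical stand-in for a Wagtail bound block (an assumption)."""
    block: Any  # the block definition, e.g. an ImageBlock instance
    value: Any  # the stored value, e.g. a dict-like struct value


Criterion = Callable[[Any], bool]


def caption_contains(word: str) -> Criterion:
    """Build a criteria function that checks the caption for a word."""
    def criteria(bound_block) -> bool:
        value = bound_block.value
        # Guard against values that are not dict-like (e.g. plain strings).
        return isinstance(value, dict) and word in value.get("caption", "")
    return criteria


def matches(bound_block, *criterias: Criterion) -> bool:
    """Mirror the rule described above: every criteria function must return True."""
    return all(criteria(bound_block) for criteria in criterias)


fresh = FakeBoundBlock(block="image", value={"caption": "Fresh baked"})
soda = FakeBoundBlock(block="image", value={"caption": "Soda Bread"})

print(matches(fresh, caption_contains("Fresh")))                            # True
print(matches(fresh, caption_contains("Fresh"), caption_contains("Soda")))  # False
print(matches(soda, caption_contains("Soda")))                              # True
```

Note that all() over an empty sequence is True, so calling with no criteria at all would treat every block as a match.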
The find_block_in_pages function also takes an optional qs argument that can be used to limit the pages to search for blocks. This allows you to scope the search efficiently if you know the area of the site you want to focus on.
For every block found, find_block_in_pages yields a tuple containing:
page: The page in which the block was found.
field: The field on the page where the block resides.
index: The index of the top-level block in that field that contains the match (useful for repeatable blocks).
parent_bound_block: The parent block of the found block, or None if the match is itself a top-level block.
bound_block: The block itself that matches the criteria.
For the following examples, we will use the Bakery Demo project. To get started, add the wagtail_find_block.py module to your project.
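Because each result is a plain five-element tuple, it can help to give the positions names once a script grows. The sketch below assumes only the tuple order listed above; Match and the sample rows are hypothetical, and strings stand in for real page, field, and block objects.

```python
from collections import Counter
from typing import Any, NamedTuple


class Match(NamedTuple):
    """Named view over the tuple order yielded by find_block_in_pages."""
    page: Any
    field: Any
    index: int
    parent_bound_block: Any
    bound_block: Any


# Hypothetical rows standing in for real results (not actual site data).
sample_results = [
    ("Bread and Circuses", "body", 2, None, "image-a"),
    ("Bread and Circuses", "body", 5, None, "image-b"),
    ("Desserts with Benefits", "body", 0, None, "image-c"),
]

named = [Match(*row) for row in sample_results]

# Count how many matching blocks each page contains.
per_page = Counter(match.page for match in named)

for page, count in per_page.items():
    print(f"{page}: {count} block(s)")
```

With real results, the same wrapping would be Match(*result) for each tuple coming out of the generator.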
This example will find all instances of ImageBlock in the Bakery Demo website and print the caption of each block:
>>> from wagtail_find_block import find_block_in_pages, block_instance_of
>>> from bakerydemo.base.blocks import ImageBlock
>>> results = find_block_in_pages(block_instance_of(ImageBlock))
>>> for page, field, index, parent_bound_block, bound_block in results:
...     print(f'Found ImageBlock in "{page}" ({field.verbose_name}): {bound_block.value["caption"]}')
Found ImageBlock in "Tracking Wild Yeast" (Page body): Raised Yummy
Found ImageBlock in "Bread and Circuses" (Page body): Soda Bread
Found ImageBlock in "Bread and Circuses" (Page body):
Found ImageBlock in "The Great Icelandic Baking Show" (Page body): Baking Soda
Found ImageBlock in "The Great Icelandic Baking Show" (Page body): Fresh baked
Found ImageBlock in "The Joy of (Baking) Soda" (Page body): Fresh baked
Found ImageBlock in "The Joy of (Baking) Soda" (Page body): Cornucopia of Breads
Found ImageBlock in "The Greatest Thing Since Sliced Bread" (Page body): Belgian Waffle
Found ImageBlock in "Desserts with Benefits" (Page body): Central Bakery
Similar to the previous example, you can also find instances of ImageBlock in the Bakery Demo website where the caption contains the word "Fresh":
>>> results = find_block_in_pages(block_instance_of(ImageBlock), lambda block: "Fresh" in block.value["caption"])
>>> page, field, index, parent_bound_block, bound_block = next(results)
>>> bound_block.value["caption"]
'Fresh baked'
Another powerful use case is finding and updating blocks. In this example, we will find all instances of ImageBlock and append "(c) 2024." to the caption of each block, saving the updated pages:
>>> results = find_block_in_pages(block_instance_of(ImageBlock))
>>> updated_pages = set()
>>> for page, field, index, parent_bound_block, bound_block in results:
...     bound_block.value["caption"] += " (c) 2024."
...     updated_pages.add(page)
>>> for page in updated_pages:
...     page.save()
One of the most powerful features of find_block_in_pages is its flexibility to work with custom criteria. Instead of just looking for a specific block type, you can pass additional conditions to narrow down the search to exactly what you need.
For example, you might want to find all RichTextBlock instances that contain the word "recipe" in their content:
>>> from wagtail.blocks import RichTextBlock
>>> results = find_block_in_pages(block_instance_of(RichTextBlock), lambda block: "recipe" in block.value.source)
>>> for page, field, index, parent_bound_block, bound_block in results:
...     print(f'Found a recipe reference in "{page}" ({field.verbose_name}): {bound_block.value}')
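Since every criteria function passed to find_block_in_pages has to return True, OR and NOT conditions need to be expressed inside a single criteria function. One possible way to do that is with small combinators. In the sketch below, any_of, negate, and mentions are hypothetical helpers that are not part of wagtail_find_block.py, and SimpleNamespace objects stand in for bound blocks.

```python
from types import SimpleNamespace


def any_of(*criterias):
    """Return a criteria function that passes when at least one given criteria passes."""
    def criteria(bound_block):
        return any(c(bound_block) for c in criterias)
    return criteria


def negate(inner):
    """Return a criteria function that passes when the given one fails."""
    def criteria(bound_block):
        return not inner(bound_block)
    return criteria


def mentions(word):
    """Illustrative criteria: case-insensitive substring check on a text value."""
    def criteria(bound_block):
        return word.lower() in str(bound_block.value).lower()
    return criteria


# Stand-ins for bound blocks; only .value is read here.
blocks = [
    SimpleNamespace(value="A sourdough recipe"),
    SimpleNamespace(value="Baking tips"),
    SimpleNamespace(value="Opening hours"),
]

wanted = any_of(mentions("recipe"), mentions("baking"))
print([b.value for b in blocks if wanted(b)])
# ['A sourdough recipe', 'Baking tips']
print([b.value for b in blocks if negate(wanted)(b)])
# ['Opening hours']
```

Because the combinators return ordinary callables that take a bound block, they could be passed alongside block_instance_of(...) like any other criteria function.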
qs for Targeted Searches
The qs argument is incredibly useful if you want to perform targeted searches. For example, you might want to search only specific page types or pages under a particular section:
>>> from wagtail.models import Page
>>> from bakerydemo.base.blocks import ImageBlock
>>> from wagtail_find_block import find_block_in_pages, block_instance_of
>>> qs = Page.objects.get(title="Blog").get_descendants().live()
>>> results = find_block_in_pages(block_instance_of(ImageBlock), qs=qs)
>>> for page, field, index, parent_bound_block, bound_block in results:
...     print(f'Found ImageBlock in "{page}" ({field.verbose_name}): {bound_block.value["caption"]}')
The find_block_in_pages function is a powerful utility for any Wagtail developer who needs to locate or update specific block types throughout a site. Whether for maintenance, migrations, or content audits, this tool can save a lot of time and effort.
The flexible criteria-based filtering, combined with the ability to traverse complex nested block structures, makes it an invaluable tool for developers working on large Wagtail projects.
You can find the full code for the wagtail_find_block.py module on GitHub or copy it from below to start using it in your own projects.
# wagtail_find_block.py
from wagtail.blocks import StreamValue, StructValue
from wagtail.blocks.list_block import ListValue
from wagtail.models import Page
from wagtail.fields import StreamField


def find_block(bound_block, *criterias, parent_bound_block=None):
    # Yield (parent, block) when every criteria function passes.
    # A matching block's own children are not searched further.
    if all(criteria(bound_block) for criteria in criterias):
        yield (parent_bound_block, bound_block)
    else:
        # No match: descend into container values, passing this block as parent.
        value = bound_block.value
        if isinstance(value, StreamValue):
            for child in value:
                yield from find_block(child, *criterias, parent_bound_block=bound_block)
        elif isinstance(value, ListValue):
            for child in value.bound_blocks:
                yield from find_block(child, *criterias, parent_bound_block=bound_block)
        elif isinstance(value, StructValue):
            for child in value.bound_blocks.values():
                yield from find_block(child, *criterias, parent_bound_block=bound_block)


def find_block_in_page(page, *criterias):
    # Search every StreamField on the page; index is the position of the
    # top-level block in that field.
    for field in page._meta.get_fields():
        if isinstance(field, StreamField):
            for index, block in enumerate(getattr(page, field.name)):
                for parent_bound_block, bound_block in find_block(block, *criterias):
                    yield (field, index, parent_bound_block, bound_block)


def find_block_in_pages(*criterias, qs=None):
    # Default to all pages; specific() loads the concrete page classes so
    # their StreamFields are available.
    if qs is None:
        qs = Page.objects.all()
    for page in qs.specific():
        for field, index, parent_bound_block, bound_block in find_block_in_page(
            page, *criterias
        ):
            yield (page, field, index, parent_bound_block, bound_block)


def block_instance_of(block_class):
    # Build a criteria function that matches blocks of the given class.
    def criteria(bound_block):
        return isinstance(bound_block.block, block_class)

    return criteria
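The traversal order of find_block can be explored without Wagtail. The following sketch is a simplified analogue rather than the module itself: Node is a hypothetical stand-in for a bound block whose children live in a plain list, and find_nodes follows the same rule of yielding a match together with its parent and otherwise descending into the children.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a bound block with nested children."""
    kind: str
    children: List["Node"] = field(default_factory=list)


def find_nodes(
    node: Node,
    *criterias: Callable[[Node], bool],
    parent: Optional[Node] = None,
) -> Iterator[Tuple[Optional[Node], Node]]:
    if all(criteria(node) for criteria in criterias):
        # A match is yielded as (parent, node); its children are not searched.
        yield (parent, node)
    else:
        # No match: recurse depth-first, passing this node as the parent.
        for child in node.children:
            yield from find_nodes(child, *criterias, parent=node)


tree = Node("stream", [
    Node("heading"),
    Node("struct", [Node("image"), Node("list", [Node("image")])]),
])

found = list(find_nodes(tree, lambda n: n.kind == "image"))
print([(parent.kind, node.kind) for parent, node in found])
# [('struct', 'image'), ('list', 'image')]
```

One consequence of this design is that a block nested inside another matching block is not reported separately, because the search stops descending at the first match on each branch.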