Monday, September 20, 2021
How many ways can we search through a tree? Let us count the ways.
#include <vector>
template <typename Value>
struct Tree {
Value value;
std::vector<Tree> children;
};
Visit all of a node's descendants before visiting any of its siblings.
template <typename Value>
const Tree<Value>* depth_first_search(const Tree<Value>& tree, const Value& value) {
if (tree.value == value) {
return &tree;
}
for (const auto& child : tree.children) {
if (const auto result = depth_first_search(child, value)) {
return result;
}
}
return nullptr;
}
Visit all of a node's siblings before visiting any of its descendants.
#include <queue>
template <typename Value>
const Tree<Value>* breadth_first_search(const Tree<Value>& tree, const Value& value) {
std::queue<const Tree<Value>*> to_visit;
to_visit.push(&tree);
for (;;) {
const auto current = to_visit.front();
to_visit.pop();
if (current->value == value) {
return current;
}
for (const auto& child : current->children) {
to_visit.push(&child);
}
if (to_visit.empty()) {
return nullptr;
}
}
}
This required a queue, and did not require recursion.
We could rewrite the depth-first search without recursion. After all, the call stack is a... stack.
#include <stack>
#include <vector>
template <typename Value>
const Tree<Value>* depth_first_search(const Tree<Value>& tree, const Value& value) {
using Ptr = const Tree<Value>*;
std::stack<Ptr, std::vector<Ptr>> to_visit;
to_visit.push(&tree);
for (;;) {
const auto current = to_visit.top();
to_visit.pop();
if (current->value == value) {
return current;
}
for (const auto& child : current->children) {
to_visit.push(&child);
}
if (to_visit.empty()) {
return nullptr;
}
}
}
This visits a node's children in the opposite order as before, but it's still a depth-first search.
This also is technically less efficient than the first depth-first search, because
we copy a node's children into to_visit
before visiting them.
The benefit of this second version is that it can accommodate trees that are
as deep as memory will allow, whereas the first version can only go as deep as
the call stack, which is usually much smaller. Also, a call stack frame is
larger than a pointer, so we use less space by storing only the Tree*
instead
of an entire stack frame.
The breadth-first search algorithm and the second depth-first search algorithm
are very similar. The only difference is in the definition of to_visit
.
Let's rewrite them as adaptations of the same underlying algorithm.
#include <queue>
#include <stack>
#include <vector>
template <typename Value, typename Container>
const Tree<Value>* search(const Tree<Value>& tree, const Value& value, Container&& to_visit) {
to_visit.push(&tree);
for (;;) {
const auto current = to_visit.next();
if (current->value == value) {
return current;
}
for (const auto& child : current->children) {
to_visit.push(&child);
}
if (to_visit.empty()) {
return nullptr;
}
}
}
template <typename Value>
struct Stack : public std::stack<const Tree<Value>*, std::vector<const Tree<Value>*>> {
const Tree<Value>*& next() const {
return this->top();
}
};
template <typename Value>
const Tree<Value>* depth_first_search(const Tree<Value>& tree, const Value& value) {
return search(tree, value, Stack<Value>());
}
template <typename Value>
struct Queue : public std::queue<const Tree<Value>*> {
const Tree<Value>*& next() const {
return this->front();
}
};
template <typename Value>
const Tree<Value>* breadth_first_search(const Tree<Value>& tree, const Value& value) {
return search(tree, value, Queue<Value>());
}
What other data structures could we use for to_visit
, other than a stack or a
queue?
If we don't know anything about the structure of tree
and the values within
it, then I see no reason to use any other data structure in an exhaustive
search.
Still, it's interesting to consider what happens when we use some other data structure.
What about a heap? The next element visited is always the smallest (or the largest) among those to_visit, according to some comparison function.
#include <queue>
#include <vector>
template <typename Value>
struct Greater {
bool operator(const Tree<Value>* left, const Tree<Value>* right) const {
return left->value > right->value;
}
};
template <typename Value>
struct PriorityQueue : public std::priority_queue<const Tree<Value>*, std::vector<const Tree<Value>*>, Greater<Value>> {
// Return the smallest element in this queue.
const Tree<Value>*& next() const {
return this->top();
}
};
template <typename Value>
const Tree<Value>* least_first_search(const Tree<Value>& tree, const Value& value) {
return search(tree, value, PriorityQueue<Value>());
}
This is more costly than the vanilla depth-first or breadth-first algorithms, because operations on the heap (priority queue) are logarithmic in its size rather than amortized constant.
We could also visit nodes in an expanding pseudorandom order by using a hash-based container.
#include <cstddef>
#include <functional>
#include <unordered_set>
template <typename Value>
struct HashValue {
std::size_t operator()(const Tree<Value>* node) const {
return std::hash<Value>(node->value)();
}
};
template <typename Value>
struct HashMultiset : public std::unordered_multiset<const Tree<Value>*, HashValue<Value>> {
const Tree<Value>* next() const {
return *this->begin();
}
void pop() {
erase(this->begin());
}
void push(const Tree<Value>* node) {
insert(node);
}
};
template <typename Value>
const Tree<Value>* bogo_search(const Tree<Value>& tree, const Value& value) {
return search(tree, value, HashMultiset<Value>());
}
The hash is of a node's value, which is strange. Stranger still would be to hash the node's address instead.
Treeeeeessss.......