In the previous tutorial, we learned async programming with Tokio. Now we take a deep dive into Rust collections – HashMap, BTreeMap, HashSet, BTreeSet, VecDeque, and BinaryHeap.

You already know Vec and basic HashMap. This tutorial covers the advanced features: the entry API, custom keys, range queries, set operations, sliding windows, and priority queues.

HashMap – The Entry API

The entry API is the most important HashMap feature to learn. It lets you insert or update values without looking up the key twice.

Word Count

The classic example – counting word frequencies:

use std::collections::HashMap;

fn word_count(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        let word = word.to_lowercase();
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_count("the cat sat on the mat");
    println!("{:?}", counts);
    // {"the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1}
}

entry() returns an Entry enum. or_insert(0) returns a mutable reference to the value – either the existing one or the newly inserted 0. Then += 1 increments it.

Without the entry API, you would write:

if let Some(count) = counts.get_mut(&word) {
    *count += 1;
} else {
    counts.insert(word, 1);
}

The entry API does the same thing in one line.

Entry API Variants

use std::collections::HashMap;

fn main() {
    let mut counts: HashMap<String, usize> = HashMap::new();

    // Insert default value if key is missing
    counts.entry("hello".to_string()).or_insert(0);

    // Insert with a function (lazy evaluation)
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    groups.entry("key".to_string()).or_insert_with(Vec::new);

    // Insert with default trait
    let mut map: HashMap<String, Vec<i32>> = HashMap::new();
    map.entry("key".to_string()).or_default();

    // Modify the value if it exists, or insert
    counts.entry("hello".to_string()).and_modify(|v| *v += 1).or_insert(1);

    println!("{:?}", counts);  // {"hello": 1}
}

Group By

Group items by a computed key:

use std::collections::HashMap;

fn group_by_length(words: &[&str]) -> HashMap<usize, Vec<String>> {
    let mut groups: HashMap<usize, Vec<String>> = HashMap::new();
    for word in words {
        groups
            .entry(word.len())
            .or_insert_with(Vec::new)
            .push(word.to_string());
    }
    groups
}

fn main() {
    let words = vec!["cat", "dog", "fish", "ant"];
    let groups = group_by_length(&words);
    println!("{:?}", groups);
    // {3: ["cat", "dog", "ant"], 4: ["fish"]}
}

HashMap – Custom Keys

Any type can be a HashMap key if it implements Hash and Eq:

use std::collections::HashMap;

#[derive(Debug, Clone, Hash, PartialEq, Eq)]
struct Coordinate {
    x: i32,
    y: i32,
}

fn main() {
    let mut grid = HashMap::new();
    grid.insert(Coordinate { x: 0, y: 0 }, "origin".to_string());
    grid.insert(Coordinate { x: 1, y: 0 }, "east".to_string());
    grid.insert(Coordinate { x: 0, y: 1 }, "north".to_string());

    let origin = grid.get(&Coordinate { x: 0, y: 0 });
    println!("{:?}", origin);  // Some("origin")
}

Important: f32 and f64 do not implement Hash or Eq because of floating-point edge cases (like NaN != NaN). If you need float keys, convert to an integer representation first.

BTreeMap – Ordered Map

BTreeMap keeps keys sorted. It has the same API as HashMap but adds range operations:

use std::collections::BTreeMap;

fn sorted_word_count(text: &str) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = sorted_word_count("b a c a b a");
    for (word, count) in &counts {
        println!("{}: {}", word, count);
    }
    // a: 3
    // b: 2
    // c: 1
}

Range Queries

BTreeMap supports efficient range queries:

use std::collections::BTreeMap;

fn main() {
    let mut map = BTreeMap::new();
    map.insert("apple".to_string(), 1);
    map.insert("banana".to_string(), 2);
    map.insert("cherry".to_string(), 3);
    map.insert("date".to_string(), 4);

    // Get all entries from "banana" to "date" (inclusive)
    let range: Vec<_> = map
        .range("banana".to_string()..="date".to_string())
        .map(|(k, v)| (k.clone(), *v))
        .collect();

    println!("{:?}", range);
    // [("banana", 2), ("cherry", 3), ("date", 4)]
}

HashMap cannot do this. It stores keys in random order.

When to Use BTreeMap vs HashMap

  • HashMap – faster lookups (O(1) average). Use for most cases.
  • BTreeMap – sorted keys, range queries (O(log n) lookup). Use when you need order.

HashSet – Unique Values

A HashSet is a HashMap without values. It stores unique items:

use std::collections::HashSet;

fn unique_words(text: &str) -> HashSet<String> {
    text.split_whitespace()
        .map(|w| w.to_lowercase())
        .collect()
}

fn main() {
    let unique = unique_words("hello world hello");
    println!("{:?}", unique);  // {"hello", "world"}
}

Set Operations

Sets support union, intersection, difference, and symmetric difference:

use std::collections::HashSet;

fn main() {
    let a: HashSet<String> = ["rust", "go", "python"]
        .iter().map(|s| s.to_string()).collect();
    let b: HashSet<String> = ["python", "java", "rust"]
        .iter().map(|s| s.to_string()).collect();

    // Union: all elements from both sets
    let union: HashSet<_> = a.union(&b).cloned().collect();
    println!("Union: {:?}", union);
    // {"rust", "go", "python", "java"}

    // Intersection: elements in both sets
    let intersection: HashSet<_> = a.intersection(&b).cloned().collect();
    println!("Intersection: {:?}", intersection);
    // {"rust", "python"}

    // Difference: elements in A but not in B
    let diff: HashSet<_> = a.difference(&b).cloned().collect();
    println!("Difference: {:?}", diff);
    // {"go"}

    // Symmetric difference: elements in one set but not both
    let sym_diff: HashSet<_> = a.symmetric_difference(&b).cloned().collect();
    println!("Symmetric difference: {:?}", sym_diff);
    // {"go", "java"}
}

Detecting Duplicates

A common pattern – check if a slice has duplicates:

use std::collections::HashSet;

fn has_duplicates<T: std::hash::Hash + Eq>(items: &[T]) -> bool {
    let mut seen = HashSet::new();
    for item in items {
        if !seen.insert(item) {
            return true;
        }
    }
    false
}

fn main() {
    println!("{}", has_duplicates(&[1, 2, 3]));    // false
    println!("{}", has_duplicates(&[1, 2, 1]));    // true
}

HashSet::insert() returns false if the element was already there. This makes duplicate detection a one-liner.

VecDeque – Double-Ended Queue

VecDeque is like Vec but supports efficient push/pop from both ends. Vec is fast at the back but slow at the front (shifting all elements). VecDeque is fast at both ends.

use std::collections::VecDeque;

fn main() {
    let mut deque = VecDeque::new();
    deque.push_back(1);    // [1]
    deque.push_back(2);    // [1, 2]
    deque.push_front(0);   // [0, 1, 2]

    println!("{:?}", deque.pop_front());  // Some(0), deque = [1, 2]
    println!("{:?}", deque.pop_back());   // Some(2), deque = [1]
}

Sliding Window

A classic use case – compute sliding window averages:

use std::collections::VecDeque;

fn sliding_window_average(numbers: &[f64], window_size: usize) -> Vec<f64> {
    let mut window: VecDeque<f64> = VecDeque::new();
    let mut averages = Vec::new();

    for &num in numbers {
        window.push_back(num);
        if window.len() > window_size {
            window.pop_front();
        }
        if window.len() == window_size {
            let avg = window.iter().sum::<f64>() / window_size as f64;
            averages.push(avg);
        }
    }

    averages
}

fn main() {
    let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];
    let avgs = sliding_window_average(&data, 3);
    println!("{:?}", avgs);  // [2.0, 3.0, 4.0]
}

Recent History Buffer

VecDeque works great for bounded history:

use std::collections::VecDeque;

struct RecentHistory<T> {
    items: VecDeque<T>,
    capacity: usize,
}

impl<T: Clone> RecentHistory<T> {
    fn new(capacity: usize) -> Self {
        Self {
            items: VecDeque::new(),
            capacity,
        }
    }

    fn add(&mut self, item: T) {
        if self.items.len() >= self.capacity {
            self.items.pop_front();
        }
        self.items.push_back(item);
    }

    fn get_all(&self) -> Vec<T> {
        self.items.iter().cloned().collect()
    }
}

fn main() {
    let mut history = RecentHistory::new(3);
    history.add("page1");
    history.add("page2");
    history.add("page3");
    history.add("page4");  // page1 is removed

    println!("{:?}", history.get_all());  // ["page2", "page3", "page4"]
}

New items go to the back. When full, the oldest item is removed from the front. This is a fixed-size ring buffer.

BinaryHeap – Priority Queue

BinaryHeap always gives you the largest element first. It is a max-heap:

use std::collections::BinaryHeap;

fn top_n_largest(numbers: &[i32], n: usize) -> Vec<i32> {
    let mut heap = BinaryHeap::from(numbers.to_vec());
    let mut result = Vec::new();
    for _ in 0..n.min(numbers.len()) {
        if let Some(val) = heap.pop() {
            result.push(val);
        }
    }
    result
}

fn main() {
    let nums = vec![42, 17, 95, 3, 88];
    let top2 = top_n_largest(&nums, 2);
    println!("{:?}", top2);  // [95, 88]
}

Custom Priority with Structs

For custom ordering, implement Ord:

use std::collections::BinaryHeap;
use std::cmp::Ordering;

#[derive(Debug, Eq, PartialEq)]
struct Task {
    priority: u32,
    name: String,
}

impl Ord for Task {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority.cmp(&other.priority)
    }
}

impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let mut heap = BinaryHeap::new();
    heap.push(Task { priority: 1, name: "Low".to_string() });
    heap.push(Task { priority: 3, name: "High".to_string() });
    heap.push(Task { priority: 2, name: "Medium".to_string() });

    while let Some(task) = heap.pop() {
        println!("[{}] {}", task.priority, task.name);
    }
    // [3] High
    // [2] Medium
    // [1] Low
}

Tasks with higher priority come out first. This is useful for job schedulers, event systems, and pathfinding algorithms.

Practical Example: Frequency Analysis

Find the most frequent items in a list:

use std::collections::HashMap;

fn most_frequent(items: &[&str], n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for item in items {
        *counts.entry(item).or_insert(0) += 1;
    }

    let mut sorted: Vec<_> = counts
        .into_iter()
        .map(|(k, v)| (k.to_string(), v))
        .collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    sorted.into_iter().take(n).collect()
}

fn main() {
    let words = vec!["rust", "go", "rust", "python", "rust", "go"];
    let freq = most_frequent(&words, 2);
    println!("{:?}", freq);  // [("rust", 3), ("go", 2)]
}

Practical Example: Graph as Adjacency List

HashMap is perfect for graphs:

use std::collections::HashMap;

fn build_graph(edges: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    let mut graph: HashMap<String, Vec<String>> = HashMap::new();
    for (from, to) in edges {
        graph.entry(from.to_string()).or_default().push(to.to_string());
        graph.entry(to.to_string()).or_default().push(from.to_string());
    }
    graph
}

fn main() {
    let edges = vec![("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")];
    let graph = build_graph(&edges);

    for (node, neighbors) in &graph {
        println!("{} -> {:?}", node, neighbors);
    }
}

Choosing the Right Collection

CollectionOrdered?LookupInsertUse Case
VecBy indexO(1) indexO(1) backMost sequences
HashMapNoO(1) avgO(1) avgKey-value lookups
BTreeMapYesO(log n)O(log n)Sorted keys, ranges
HashSetNoO(1) avgO(1) avgUnique items, membership
BTreeSetYesO(log n)O(log n)Sorted unique items
VecDequeBy indexO(1) endsO(1) endsQueues, sliding windows
BinaryHeapPartialO(1) maxO(log n)Priority queues

Quick rules:

  • Need key-value? Use HashMap. Need sorted keys? Use BTreeMap.
  • Need unique items? Use HashSet. Need them sorted? Use BTreeSet.
  • Need a queue (FIFO)? Use VecDeque.
  • Need “give me the biggest”? Use BinaryHeap.
  • Need a regular list? Use Vec.

Source Code

View source code on GitHub ->

What’s Next?

We now know all the important collections in Rust. You can pick the right data structure for any problem. Next, we learn modules and crates – how to organize your Rust project into multiple files, control visibility, and use external dependencies.

Next: Rust Tutorial #17: Modules and Crates