In the previous tutorial, we learned async programming with Tokio. Now we take a deep dive into Rust collections – HashMap, BTreeMap, HashSet, BTreeSet, VecDeque, and BinaryHeap.
You already know Vec and basic HashMap. This tutorial covers the advanced features: the entry API, custom keys, range queries, set operations, sliding windows, and priority queues.
HashMap – The Entry API
The entry API is the most important HashMap feature to learn. It lets you insert or update values without looking up the key twice.
Word Count
The classic example – counting word frequencies:
use std::collections::HashMap;
fn word_count(text: &str) -> HashMap<String, usize> {
let mut counts = HashMap::new();
for word in text.split_whitespace() {
let word = word.to_lowercase();
*counts.entry(word).or_insert(0) += 1;
}
counts
}
fn main() {
let counts = word_count("the cat sat on the mat");
println!("{:?}", counts);
// {"the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1}
}
entry() returns an Entry enum. or_insert(0) returns a mutable reference to the value – either the existing one or the newly inserted 0. Then += 1 increments it.
Without the entry API, you would write:
if let Some(count) = counts.get_mut(&word) {
*count += 1;
} else {
counts.insert(word, 1);
}
The entry API does the same thing in one line.
Entry API Variants
use std::collections::HashMap;
fn main() {
let mut counts: HashMap<String, usize> = HashMap::new();
// Insert default value if key is missing
counts.entry("hello".to_string()).or_insert(0);
// Insert with a function (lazy evaluation)
let mut groups: HashMap<String, Vec<String>> = HashMap::new();
groups.entry("key".to_string()).or_insert_with(Vec::new);
// Insert with default trait
let mut map: HashMap<String, Vec<i32>> = HashMap::new();
map.entry("key".to_string()).or_default();
// Modify the value if it exists, or insert
counts.entry("hello".to_string()).and_modify(|v| *v += 1).or_insert(1);
println!("{:?}", counts); // {"hello": 1}
}
Group By
Group items by a computed key:
use std::collections::HashMap;
fn group_by_length(words: &[&str]) -> HashMap<usize, Vec<String>> {
let mut groups: HashMap<usize, Vec<String>> = HashMap::new();
for word in words {
groups
.entry(word.len())
.or_insert_with(Vec::new)
.push(word.to_string());
}
groups
}
fn main() {
let words = vec!["cat", "dog", "fish", "ant"];
let groups = group_by_length(&words);
println!("{:?}", groups);
// {3: ["cat", "dog", "ant"], 4: ["fish"]}
}
HashMap – Custom Keys
Any type can be a HashMap key if it implements Hash and Eq:
use std::collections::HashMap;
#[derive(Debug, Clone, Hash, PartialEq, Eq)]
struct Coordinate {
x: i32,
y: i32,
}
fn main() {
let mut grid = HashMap::new();
grid.insert(Coordinate { x: 0, y: 0 }, "origin".to_string());
grid.insert(Coordinate { x: 1, y: 0 }, "east".to_string());
grid.insert(Coordinate { x: 0, y: 1 }, "north".to_string());
let origin = grid.get(&Coordinate { x: 0, y: 0 });
println!("{:?}", origin); // Some("origin")
}
Important: f32 and f64 do not implement Hash or Eq because of floating-point edge cases (like NaN != NaN). If you need float keys, convert to an integer representation first.
BTreeMap – Ordered Map
BTreeMap keeps keys sorted. It has the same API as HashMap but adds range operations:
use std::collections::BTreeMap;
fn sorted_word_count(text: &str) -> BTreeMap<String, usize> {
let mut counts = BTreeMap::new();
for word in text.split_whitespace() {
*counts.entry(word.to_lowercase()).or_insert(0) += 1;
}
counts
}
fn main() {
let counts = sorted_word_count("b a c a b a");
for (word, count) in &counts {
println!("{}: {}", word, count);
}
// a: 3
// b: 2
// c: 1
}
Range Queries
BTreeMap supports efficient range queries:
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
map.insert("apple".to_string(), 1);
map.insert("banana".to_string(), 2);
map.insert("cherry".to_string(), 3);
map.insert("date".to_string(), 4);
// Get all entries from "banana" to "date" (inclusive)
let range: Vec<_> = map
.range("banana".to_string()..="date".to_string())
.map(|(k, v)| (k.clone(), *v))
.collect();
println!("{:?}", range);
// [("banana", 2), ("cherry", 3), ("date", 4)]
}
HashMap cannot do this. It stores keys in random order.
When to Use BTreeMap vs HashMap
- HashMap – faster lookups (O(1) average). Use for most cases.
- BTreeMap – sorted keys, range queries (O(log n) lookup). Use when you need order.
HashSet – Unique Values
A HashSet is a HashMap without values. It stores unique items:
use std::collections::HashSet;
fn unique_words(text: &str) -> HashSet<String> {
text.split_whitespace()
.map(|w| w.to_lowercase())
.collect()
}
fn main() {
let unique = unique_words("hello world hello");
println!("{:?}", unique); // {"hello", "world"}
}
Set Operations
Sets support union, intersection, difference, and symmetric difference:
use std::collections::HashSet;
fn main() {
let a: HashSet<String> = ["rust", "go", "python"]
.iter().map(|s| s.to_string()).collect();
let b: HashSet<String> = ["python", "java", "rust"]
.iter().map(|s| s.to_string()).collect();
// Union: all elements from both sets
let union: HashSet<_> = a.union(&b).cloned().collect();
println!("Union: {:?}", union);
// {"rust", "go", "python", "java"}
// Intersection: elements in both sets
let intersection: HashSet<_> = a.intersection(&b).cloned().collect();
println!("Intersection: {:?}", intersection);
// {"rust", "python"}
// Difference: elements in A but not in B
let diff: HashSet<_> = a.difference(&b).cloned().collect();
println!("Difference: {:?}", diff);
// {"go"}
// Symmetric difference: elements in one set but not both
let sym_diff: HashSet<_> = a.symmetric_difference(&b).cloned().collect();
println!("Symmetric difference: {:?}", sym_diff);
// {"go", "java"}
}
Detecting Duplicates
A common pattern – check if a slice has duplicates:
use std::collections::HashSet;
fn has_duplicates<T: std::hash::Hash + Eq>(items: &[T]) -> bool {
let mut seen = HashSet::new();
for item in items {
if !seen.insert(item) {
return true;
}
}
false
}
fn main() {
println!("{}", has_duplicates(&[1, 2, 3])); // false
println!("{}", has_duplicates(&[1, 2, 1])); // true
}
HashSet::insert() returns false if the element was already there. This makes duplicate detection a one-liner.
VecDeque – Double-Ended Queue
VecDeque is like Vec but supports efficient push/pop from both ends. Vec is fast at the back but slow at the front (shifting all elements). VecDeque is fast at both ends.
use std::collections::VecDeque;
fn main() {
let mut deque = VecDeque::new();
deque.push_back(1); // [1]
deque.push_back(2); // [1, 2]
deque.push_front(0); // [0, 1, 2]
println!("{:?}", deque.pop_front()); // Some(0), deque = [1, 2]
println!("{:?}", deque.pop_back()); // Some(2), deque = [1]
}
Sliding Window
A classic use case – compute sliding window averages:
use std::collections::VecDeque;
fn sliding_window_average(numbers: &[f64], window_size: usize) -> Vec<f64> {
let mut window: VecDeque<f64> = VecDeque::new();
let mut averages = Vec::new();
for &num in numbers {
window.push_back(num);
if window.len() > window_size {
window.pop_front();
}
if window.len() == window_size {
let avg = window.iter().sum::<f64>() / window_size as f64;
averages.push(avg);
}
}
averages
}
fn main() {
let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let avgs = sliding_window_average(&data, 3);
println!("{:?}", avgs); // [2.0, 3.0, 4.0]
}
Recent History Buffer
VecDeque works great for bounded history:
use std::collections::VecDeque;
struct RecentHistory<T> {
items: VecDeque<T>,
capacity: usize,
}
impl<T: Clone> RecentHistory<T> {
fn new(capacity: usize) -> Self {
Self {
items: VecDeque::new(),
capacity,
}
}
fn add(&mut self, item: T) {
if self.items.len() >= self.capacity {
self.items.pop_front();
}
self.items.push_back(item);
}
fn get_all(&self) -> Vec<T> {
self.items.iter().cloned().collect()
}
}
fn main() {
let mut history = RecentHistory::new(3);
history.add("page1");
history.add("page2");
history.add("page3");
history.add("page4"); // page1 is removed
println!("{:?}", history.get_all()); // ["page2", "page3", "page4"]
}
New items go to the back. When full, the oldest item is removed from the front. This is a fixed-size ring buffer.
BinaryHeap – Priority Queue
BinaryHeap always gives you the largest element first. It is a max-heap:
use std::collections::BinaryHeap;
fn top_n_largest(numbers: &[i32], n: usize) -> Vec<i32> {
let mut heap = BinaryHeap::from(numbers.to_vec());
let mut result = Vec::new();
for _ in 0..n.min(numbers.len()) {
if let Some(val) = heap.pop() {
result.push(val);
}
}
result
}
fn main() {
let nums = vec![42, 17, 95, 3, 88];
let top2 = top_n_largest(&nums, 2);
println!("{:?}", top2); // [95, 88]
}
Custom Priority with Structs
For custom ordering, implement Ord:
use std::collections::BinaryHeap;
use std::cmp::Ordering;
#[derive(Debug, Eq, PartialEq)]
struct Task {
priority: u32,
name: String,
}
impl Ord for Task {
fn cmp(&self, other: &Self) -> Ordering {
self.priority.cmp(&other.priority)
}
}
impl PartialOrd for Task {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
fn main() {
let mut heap = BinaryHeap::new();
heap.push(Task { priority: 1, name: "Low".to_string() });
heap.push(Task { priority: 3, name: "High".to_string() });
heap.push(Task { priority: 2, name: "Medium".to_string() });
while let Some(task) = heap.pop() {
println!("[{}] {}", task.priority, task.name);
}
// [3] High
// [2] Medium
// [1] Low
}
Tasks with higher priority come out first. This is useful for job schedulers, event systems, and pathfinding algorithms.
Practical Example: Frequency Analysis
Find the most frequent items in a list:
use std::collections::HashMap;
fn most_frequent(items: &[&str], n: usize) -> Vec<(String, usize)> {
let mut counts: HashMap<&str, usize> = HashMap::new();
for item in items {
*counts.entry(item).or_insert(0) += 1;
}
let mut sorted: Vec<_> = counts
.into_iter()
.map(|(k, v)| (k.to_string(), v))
.collect();
sorted.sort_by(|a, b| b.1.cmp(&a.1));
sorted.into_iter().take(n).collect()
}
fn main() {
let words = vec!["rust", "go", "rust", "python", "rust", "go"];
let freq = most_frequent(&words, 2);
println!("{:?}", freq); // [("rust", 3), ("go", 2)]
}
Practical Example: Graph as Adjacency List
HashMap is perfect for graphs:
use std::collections::HashMap;
fn build_graph(edges: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
let mut graph: HashMap<String, Vec<String>> = HashMap::new();
for (from, to) in edges {
graph.entry(from.to_string()).or_default().push(to.to_string());
graph.entry(to.to_string()).or_default().push(from.to_string());
}
graph
}
fn main() {
let edges = vec![("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")];
let graph = build_graph(&edges);
for (node, neighbors) in &graph {
println!("{} -> {:?}", node, neighbors);
}
}
Choosing the Right Collection
| Collection | Ordered? | Lookup | Insert | Use Case |
|---|---|---|---|---|
Vec | By index | O(1) index | O(1) back | Most sequences |
HashMap | No | O(1) avg | O(1) avg | Key-value lookups |
BTreeMap | Yes | O(log n) | O(log n) | Sorted keys, ranges |
HashSet | No | O(1) avg | O(1) avg | Unique items, membership |
BTreeSet | Yes | O(log n) | O(log n) | Sorted unique items |
VecDeque | By index | O(1) ends | O(1) ends | Queues, sliding windows |
BinaryHeap | Partial | O(1) max | O(log n) | Priority queues |
Quick rules:
- Need key-value? Use
HashMap. Need sorted keys? UseBTreeMap. - Need unique items? Use
HashSet. Need them sorted? UseBTreeSet. - Need a queue (FIFO)? Use
VecDeque. - Need “give me the biggest”? Use
BinaryHeap. - Need a regular list? Use
Vec.
Source Code
Related Articles
- Rust Tutorial #15: Async/Await and Tokio – previous tutorial
- Rust Tutorial #17: Modules and Crates – next tutorial
- Rust Tutorial Series – all tutorials
What’s Next?
We now know all the important collections in Rust. You can pick the right data structure for any problem. Next, we learn modules and crates – how to organize your Rust project into multiple files, control visibility, and use external dependencies.