How to sort a Vec of structs by a String field?

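The closure you pass to sort_by_key has to return the key by value, so returning d.x tries to move the String out of a borrowed Dummy:

pub fn sort_by_key<K, F>(&mut self, f: F)
where
    F: FnMut(&T) -> K,
    K: Ord,

This is why you are getting error E0507 (cannot move out of borrowed content).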

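A simple fix is to store a reference in your struct instead of an owned String. A &str is Copy, so the closure can return it without moving anything out of the struct.

You then need to add a lifetime parameter to the struct, so the compiler can check that the referenced string lives at least as long as the struct that points to it.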

struct Dummy<'a> {
    x: &'a str,
    y: i8,
}

fn main() {
    let mut dummies: Vec<Dummy> = Vec::new();
    dummies.push(Dummy { x: "a", y: 1 });
    dummies.push(Dummy { x: "b", y: 2 });

    dummies.sort_by_key(|d| d.x);
    dummies.sort_by_key(|d| d.y);
}

First, let's look at your original error message, then we'll go through a few fixes and try to understand everything.

In the closure that you use in dummies.sort_by_key(|d| d.x);, d is a reference to a Dummy instance. However, the field access d.x is the String itself. If you wanted to return that String, you'd have to give ownership of it to whatever called the closure. But since d was just a reference, you can't pass ownership of its data.

One easy fix is to clone the string: dummies.sort_by_key(|d| d.x.clone());. This makes a copy of the string before returning it from the closure (this is Andra's solution). This works perfectly, but if performance or memory use is an issue, we can avoid the clone.
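As a minimal sketch of the cloning fix, assuming a Dummy struct with an owned String field like the one in the question (the helper name sorted_by_x_cloning is made up for illustration):

```rust
// Assumed shape of the struct from the question: `x` is an owned String.
struct Dummy {
    x: String,
    y: i8,
}

// Hypothetical helper: sorts by the String field using a cloned key.
fn sorted_by_x_cloning(mut dummies: Vec<Dummy>) -> Vec<Dummy> {
    // `|d| d.x` would be rejected with E0507, because it moves the String
    // out of a borrowed Dummy. Cloning hands the sort an owned key instead.
    dummies.sort_by_key(|d| d.x.clone());
    dummies
}
```

The key closure runs for every comparison, so each comparison pays for a fresh clone of the string.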

The idea here is that using the string as the key is wasteful. Really, all we need to know is which of two strings is smaller. If we use the string as a key, then every time the sort function needs to compare two Dummys, it calls the key function on each one and the strings are passed to a (very short) function that simply compares them. If we did the comparison in the same context as the borrow, we'd be able to simply pass the result of the comparison on, rather than the strings.

The solution is the sort_by method on slices. This allows us to take references to two Dummys and decide if one is smaller than the other. For example: dummies.sort_by(|d1, d2| d1.x.cmp(&d2.x));
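A short sketch of the sort_by approach, again assuming a Dummy with an owned String field (the wrapper function sort_by_x is only there for illustration):

```rust
use std::cmp::Ordering;

// Assumed shape of the struct from the question.
struct Dummy {
    x: String,
    y: i8,
}

// Hypothetical helper: sorts in place by the String field without cloning.
fn sort_by_x(dummies: &mut [Dummy]) {
    dummies.sort_by(|d1, d2| {
        // Both strings are only borrowed while they are compared;
        // the Ordering is the only thing that leaves the closure.
        let ordering: Ordering = d1.x.cmp(&d2.x);
        ordering
    });
}
```

Because the closure returns an Ordering rather than a key, nothing borrowed from the elements has to outlive the comparison.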

Addendum

Why can't we use sort_by_key without cloning the Strings? Surely there must be some clever way of using string slices and lifetimes to do it.

Let's look at the signature of the sort_by_key function.

pub fn sort_by_key<K, F>(&mut self, f: F) where
    F: FnMut(&T) -> K,
    K: Ord, 

The interesting part of this function is not what is there, but what isn't there. The type parameter K doesn't depend on the lifetime of the reference passed to f.

As the slice is sorted, the key function gets repeatedly called with a reference to a Dummy instance. Since the slice is sorted a little between each call, the lifetime of the reference must be very short. If it were longer, it'd get invalidated the next time the elements of the slice were moved around. However, K can't depend on that lifetime. That means that whatever our key function is, it can't return anything that depends on the current location of the Dummy (e.g. a string slice, a reference, or any other clever construction [1]).

However, we could make K depend on the lifetime of whatever is passed to it. The idea here is what's called Higher-Rank Trait Bounds. These currently only work with lifetimes (though in theory they could be extended to all type parameters). We could posit another slice method with signature

fn sort_by_key_hrtb<T, F, K>(slice: &mut [T], f: F)
where
    F: Fn(&T) -> &K,
    K: Ord,

Why does this make things work? In F: Fn(&T) -> &K, the lifetime of the output reference is implicitly the same as (or longer than) the lifetime of the input reference. Desugared, this is F: for<'a> Fn(&'a T) -> &'a K, which says that f should be able to take a reference with any lifetime 'a and return a reference with lifetime (greater than or equal to) 'a. Now we have a method that works exactly how you wanted to use it (except for a pesky & [2]).
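One way such a function could be written is on top of sort_by. This is only a sketch: sort_by_key_hrtb is the hypothetical function posited above, not something in the standard library, and the Dummy struct is assumed to match the question.

```rust
// Assumed shape of the struct from the question.
struct Dummy {
    x: String,
    y: i8,
}

// Hypothetical free function: the key is returned by reference, with its
// lifetime tied to the element reference passed to `f`.
fn sort_by_key_hrtb<T, F, K>(slice: &mut [T], f: F)
where
    F: Fn(&T) -> &K, // sugar for: for<'a> Fn(&'a T) -> &'a K
    K: Ord,
{
    // Each key borrow lasts only for one comparison, so it never outlives
    // the element references that sort_by hands to the closure.
    slice.sort_by(|a, b| f(a).cmp(f(b)));
}
```

With this in scope, sort_by_key_hrtb(&mut dummies, |d| &d.x); sorts a Vec<Dummy> by its String field without cloning, and |d| &d.y works the same way for the i8 field.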


  1. Actually, there is one (unsafe) clever construction that probably works, but I haven't vetted it. You can use a wrapper around a raw pointer to a String and then impl Ord for that wrapper so that it dereferences the pointer to do the comparison [3]. The return type for the key function would be *const String, so we don't need any lifetimes. This is inherently unsafe though, and I definitely wouldn't recommend it.

  2. The only reason we need to use &mut dummies here is that sort_by_key_hrtb isn't actually a slice method. If it were, dummies would be automatically borrowed and dereferenced into a slice, so we could call the function like dummies.sort_by_key_hrtb(|d| &d.x);.

  3. Why a wrapper instead of just a pointer? *const T implements Ord, but it does so by comparing the addresses rather than the underlying value (if any), which isn't what we want here.
