Ordering an array with special characters like accents

If you just want to sort the strings as if they didn't have the accents, you could use the following:

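// Requires java.text.Normalizer, java.util.Collections, java.util.Comparator
Collections.sort(strs, new Comparator<String>() {
    @Override
    public int compare(String o1, String o2) {
        // NFD only splits an accented letter into base letter + combining marks;
        // the marks (\p{M}) must also be removed, or they still affect the order.
        o1 = Normalizer.normalize(o1, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        o2 = Normalizer.normalize(o2, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        return o1.compareTo(o2);
    }
});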

Related question:

  • Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars

For more sophisticated use cases you will want to read up on java.text.Collator. Here's an example:

// Collator implements Comparator<Object>, so it can be passed to sort directly.
// Creating it once avoids building a new Collator for every comparison.
Collator usCollator = Collator.getInstance(Locale.US);
Collections.sort(strs, usCollator);
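
A Collator can also ignore accents by itself through its strength setting, which may be closer to the "sort as if there were no accents" goal than stripping marks by hand. The following is a minimal sketch, assuming Locale.US and PRIMARY strength (which ignores case as well as accents); the class and method names are made up for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class AccentInsensitiveSort {
    // Hypothetical helper: returns a sorted copy and leaves the input untouched.
    static List<String> sortIgnoringAccents(List<String> input) {
        Collator collator = Collator.getInstance(Locale.US);
        // PRIMARY compares base letters only; accent and case differences are ignored.
        collator.setStrength(Collator.PRIMARY);
        List<String> sorted = new ArrayList<>(input);
        sorted.sort(collator);
        return sorted;
    }
}
```

Calling AccentInsensitiveSort.sortIgnoringAccents(strs) should place a word such as "éclair" among the other words starting with "e" rather than after "z", which is where a plain String.compareTo would put it.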

If none of the predefined collation rules meet your needs, you can try using the java.text.RuleBasedCollator.


You should take a look at RuleBasedCollator

The RuleBasedCollator class is a concrete subclass of Collator that provides a simple, data-driven, table collator. With this class you can create a customized table-based Collator. RuleBasedCollator maps characters to sort keys.

RuleBasedCollator has the following restrictions for efficiency (other subclasses may be used for more complex languages):

  • If a special collation rule controlled by a <modifier> is specified, it applies to the whole collator object.
  • All non-mentioned characters are at the end of the collation order.
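
As a rough illustration of building such a table, here is a small sketch that defines its own order for a handful of characters. The rule string, class name, and method name are assumptions chosen for the example, not something taken from the question; a real rule set would need to list every character you care about, since unmentioned ones sort at the end.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

final class CustomRuleSort {
    // Hypothetical rule set: each '<' introduces the next character in the
    // order, so this reads a, a-acute (\u00e1), b, c.
    static final String RULES = "< a < \u00e1 < b < c";

    // Hypothetical helper: returns a copy of the input sorted by RULES.
    static List<String> sortWithRules(List<String> input) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            List<String> sorted = new ArrayList<>(input);
            sorted.sort(collator);
            return sorted;
        } catch (ParseException e) {
            // The constructor rejects rule strings it cannot parse.
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }
}
```

With these rules, "á" is treated as a separate letter that sorts between "a" and "b". Changing the rule string is all it takes to move it elsewhere in the order.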