How to sort Alphanumeric String

Your solution lies in The Alphanum Algorithm and you can implement like this


Here is a self-contained example on how to do this (not particularly optimized):

final Pattern p = Pattern.compile("^\\d+");
String[] examples = { 
   "1some", "2some", "20some", "21some", "3some", "some", "1abc", "abc"
};
Comparator<String> c = new Comparator<String>() {
    @Override
    public int compare(String object1, String object2) {
        Matcher m = p.matcher(object1);
        Integer number1 = null;
        if (!m.find()) {
            return object1.compareTo(object2);
        }
        else {
            Integer number2 = null;
            number1 = Integer.parseInt(m.group());
            m = p.matcher(object2);
            if (!m.find()) {
                return object1.compareTo(object2);
            }
            else {
                number2 = Integer.parseInt(m.group());
                int comparison = number1.compareTo(number2);
                if (comparison != 0) {
                    return comparison;
                }
                else {
                    return object1.compareTo(object2);
                }
            }
        }
    }
};
List<String> examplesList = new ArrayList<String>(Arrays.asList(examples));
Collections.sort(examplesList, c);
System.out.println(examplesList);

Output

[1abc, 1some, 2some, 3some, 20some, 21some, abc, some]

Explanation

  • The example uses a constant Pattern to infer whether a number is in the String's starting position.
  • If not present in the first String, it compares it as is to the second.
  • If present indeed in the first, it checks the second.
  • If not present in the second, it compares the two Strings as is, again
  • If present in both, it compares the Integers instead of the whole Strings, hence resulting in a numerical comparison rather than a lexicographical one
  • If the number compare identical, it goes back to lexicographic comparison of the whole Strings (thanks MihaiC for spotting this one)

First make an alphanumerical comparator splitting the string in String or Integer parts.

public class AlphaNumericalComparator implements Comparator<String> {
    @Override
    public int compare(String o1, String o2) {
        List<Object> parts1 = partsOf(o1);
        List<Object> parts2 = partsOf(o2);
        while (!parts1.isEmpty() && !parts2.isEmpty()) {
            Object part1 = parts1.remove(0);
            Object part2 = parts2.remove(0);
            int cmp = 0;
            if (part1 instanceof Integer && part2 instanceof Integer) {
                cmp = Integer.compare((Integer)part1, (Integer)part2);
            } else if (part1 instanceof String && part2 instanceof String) {
                cmp = ((String) part1).compareTo((String) part2);
            } else {
                cmp = part1 instanceof String ? 1 : -1; // XXXa > XXX1
            }
            if (cmp != 0) {
                return cmp;
            }
        }
        if (parts1.isEmpty() && parts2.isEmpty()) {
            return 0;
        }
        return parts1.isEmpty() ? -1 : 1;
    }

    private List<Object> partsOf(String s) {
        List<Object> parts = new LinkedList<>();
        int pos0 = 0;
        int pos = 0;
        boolean wasDigit = false;
        while (true) {
            if (pos >= s.length()
                    || Character.isDigit(s.charAt(pos)) != wasDigit) {
                if (pos > pos0) {
                    String part = s.substring(pos0, pos);
                    parts.add(wasDigit? Integer.valueOf(part) : part);
                    pos0 = pos;
                }
                if (pos >= s.length()) {
                    break;
                }
                wasDigit = !wasDigit;
            }
            ++pos;
        }
        return parts;
    }
};

Then use this comparator in your own one, in Java 8 you may simply use Comparator's static methods.

Tags:

Java

Sorting