Maintain an ArrayList of unique arrays in java

If we assume that your arrays cannot contain several time the same integer (like [1, 1, 2]) then your definition of uniqueness (having the same elements regardless of the order) for your array is the one of a Set, so you could use a Set of Set.

public static void main(String[] args){
    Set<Set<Integer>> result = new HashSet<>();
    int [] x = {1,2,3};
    int [] z = {2,1,3};
    int [] m = {2,1,3};

    result.add(arrayToSet(x));
    result.add(arrayToSet(z));
    result.add(arrayToSet(m));

    System.out.println(result);

}

private static Set<Integer> arrayToSet(int [] arr){
    return Arrays.stream(arr).boxed().collect(Collectors.toSet());
}

If you want to keep your arrays then which one should be kept when two arrays have the same elements? If it's the first one that has been added, you could use a Map<Set<Integer>, int[]> and then the values of your map contains the arrays.


If you need to consider that it can contain several time the same integer, then those are Multisets. You can implement a Multiset by a Map<Integer, Integer> which counts how many time each element is present. Then you can use the same implementation but with a Set<Map<Integer, Integer>> instead of a Set<Integer> :

public static void main(String[] args){
    Set<Map<Integer,Long>> result = new HashSet<>();
    int [] x = {1,2,3};
    int [] z = {1,2,2,3};
    int [] m = {1,2,3,2};

    result.add(arrayToMultiSet(x));
    result.add(arrayToMultiSet(z));
    result.add(arrayToMultiSet(m));

    System.out.println(result);

}

private static Map<Integer,Long> arrayToMultiSet(int [] arr){
    return Arrays.stream(arr).boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
}

Note : I used Map<Integer,Long> because Collectors.counting() returns a collector of Long.


You can create a method to add if not equals like so :

public static Set<int[]> addIfNotExist(Set<int[]> result, int[] array) {
    Arrays.sort(array);
    boolean check = result.stream()
            .anyMatch(a -> {
                Arrays.sort(a);
                return Arrays.equals(a, array);
            });
    if (check) {
        return result;
    } else {
        result.add(array);
        return result;
    }
}

Then you can call your method like so :

result = addIfNotExist(result, x);
result = addIfNotExist(result, z);
result = addIfNotExist(result, m);

Output

[1, 2, 3]

Or if you use a static Set, you can just use :

static Set<int[]> result = new LinkedHashSet<>();

public static void main(String[] args) {

    int[] x = {1, 2, 3};
    int[] z = {2, 1, 3};
    int[] m = {2, 1, 3};

    addIfNotExist(result, x);
    addIfNotExist(result, z);
    addIfNotExist(result, m);

    for (int[] arr : result) {
        System.out.println(Arrays.toString(arr));
    }
}

public static void addIfNotExist(Set<int[]> result, int[] array) {
    Arrays.sort(array);
    boolean check = result.stream()
            .anyMatch(a -> {
                Arrays.sort(a);
                return Arrays.equals(a, array);
            });
    if (!check) {
        result.add(array);
    }
}

There already have been some answers, but some of them make certain assumptions that can not be derived from the question, others suggest certain changes that may be considered as workarounds, and others have certain drawbacks that should not be ignored.

Addressing some of them here:

  • Changing the elements to a Set<Integer> makes the assumption that each element may only appear once. Additionally, if the existing code already creates the int[] arrays, and the downstream code needs the int[] arrays, then using the resulting data structure would be clumsy:

    int array[] = somewhere.getArray();
    Set<Integer> set = convert(array); // Convert array to set
    data.add(set);
    ...
    set = data.iterator().next();
    array = convert(set);              // Convert set to array
    somewhere.setArray(array);
    

    Depending on the size of the arrays, this may have an impact on performance and generate some memory overhad.

  • Using a TreeSet<int[]> looks like a simple and reasonable solution. The nicest thing is that it can directly store the int[] arrays. But it has some drawbacks:

    1. It implies an ordering. It is no longer possible to use another Set implementation (like a LinkedHashSet) that retains the insertion order
    2. It may be a bit tricky to implement the comparison in a way that is consistent with equals, and failing to do so will cause the set to no longer obey the general contract of the Set interface
    3. A simple but correct implementation of the comparison will likely involve sorting the arrays. This means that the arrays might either have to be modified by their insertion into the set, which is certainly not acceptable, or one would have to create defensive copies of the arrays. Here, one has to keep in mind that the copy and the sorting will have to be done each and every time when a new array is added, and it will have to be done multiple times while going down the tree. Although the number of comparisons will only be O(log(n)) for a set with n elements, sorting will take O(m log(m)) each time for arrays of length m, which may be prohibitively expensive.
  • Similar arguments may be applied for the approaches that check whether an "equivalent" array already exists before inserting a new one. Additionally, these approaches are not based on a data structure, but have to be part of the code that uses the data structure.

For these reasons, I'd go for an approach that is basically the same as the one that Mykhailo Moskura mentioned in his answer : It is based on a class that simply wraps the given int[] arrays, and implements equals and hashCode accordingly.

(Note that you could also let this class implement Comparable, adding some flexibility about whether it can be put into a TreeSet, if you are aware of the possible difficulties and drawbacks mentioned above...)

In the example below, this class is called UnorderedIntArray. Conceptually, it would be preferable to have a Set<int[]>, and the solution below has to use a Set<UnorderedIntArray>. But since this class only wraps an existing array, the performance- and memory overhead are practically zero, and I therefore still consider it as advantageous compared to converting between the int[] and some other data type. Also note that the equals method in the example below is not very efficient, but it is a simple way to ensure that the contracts of equals are obeyed.

import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UniqueArraysTest {

    public static void main(String[] args) {
        Set<UnorderedIntArray> result = new LinkedHashSet<>();
        int[] x = { 1, 2, 3 };
        int[] y = { 2, 1, 3 };
        int[] z = { 2, 1, 3 };

        int[] u = { 1, 1, 1, 2, 3 };
        int[] v = { 1, 1, 1, 2, 3 };
        int[] w = { 1, 1, 1, 1, 3 };

        result.add(new UnorderedIntArray(x));
        result.add(new UnorderedIntArray(y));
        result.add(new UnorderedIntArray(z));
        result.add(new UnorderedIntArray(u));
        result.add(new UnorderedIntArray(v));
        result.add(new UnorderedIntArray(w));

        for (UnorderedIntArray a : result) {
            int[] array = a.getArray();
            System.out.println(Arrays.toString(array));
        }

    }

    static class UnorderedIntArray {
        private final int array[];

        UnorderedIntArray(int array[]) {
            this.array = array;
        }

        int[] getArray() {
            return array;
        }

        @Override
        public int hashCode() {
            return IntStream.of(array).sum();
        }

        @Override
        public boolean equals(Object object) {
            if (object == this) {
                return true;
            }
            if (object == null) {
                return false;
            }
            if (!(object instanceof UnorderedIntArray)) {
                return false;
            }
            UnorderedIntArray that = (UnorderedIntArray)object;
            if (this.array.length != that.array.length) {
                return false;
            }
            // This is very simple, but not very efficient. More efficient
            // solutions could be implemented, but they are not trivial...
            Map<Integer, Long> thisFrequencies = computeFrequencies(this.array);
            Map<Integer, Long> thatFrequencies = computeFrequencies(that.array);
            return thisFrequencies.equals(thatFrequencies);
        }

        private Map<Integer, Long> computeFrequencies(int array[]) {
            return Arrays.stream(array).boxed().collect(
                Collectors.groupingBy(Function.identity(), Collectors.counting()));
        }

        @Override
        public String toString() {
            return Arrays.toString(array);
        }

    }
}

For the input of

int[] x = { 1, 2, 3 };
int[] y = { 2, 1, 3 };
int[] z = { 2, 1, 3 };
int[] u = { 1, 1, 1, 2, 3 };
int[] v = { 1, 1, 1, 2, 3 };
int[] w = { 1, 1, 1, 1, 3 };

the output is the expected

[1, 2, 3]
[1, 1, 1, 2, 3]
[1, 1, 1, 1, 3]

It definitely feels hacky and wrong, but you could use a TreeSet with a custom Comparator. Depending on your needs this might actually work, but at least note that this is breaking the general contract of the Set interface.

class Demo {
    public static void main(String[] args) throws Exception {
        Set<int[]> result = new TreeSet<>(new Hack());
        int[] x = {1,2,3};
        int[] z = {2,1,3};
        int[] m = {2,1,3};

        result.add(x);
        result.add(z);
        result.add(m);

        for (int[] arr : result) {
            System.out.println(Arrays.toString(arr));
        }
    }
}

class Hack implements Comparator<int[]> {

    @Override
    public int compare(int[] e1, int[] e2) {
        int[] copy1 = Arrays.copyOf(e1, e1.length);
        int[] copy2 = Arrays.copyOf(e2, e2.length);
        Arrays.sort(copy1);
        Arrays.sort(copy2);
        return Arrays.compare(copy1, copy2);
    }
}

Output:

[1, 2, 3]

If you are still on Java 8 use this Hack implementation:

class Hack implements Comparator<int[]> {

    @Override
    public int compare(int[] e1, int[] e2) {
        int[] copy1 = Arrays.copyOf(e1, e1.length);
        int[] copy2 = Arrays.copyOf(e2, e2.length);
        Arrays.sort(copy1);
        Arrays.sort(copy2);
        int cmp = Integer.compare(copy1.length, copy2.length);
        if (cmp != 0) {
            return cmp;
        }
        for (int i = 0; i < copy1.length; i++) {
            cmp = Integer.compare(copy1[i], copy2[i]);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }
}

Tags:

Java

Arraylist