Check if a sublist contains an item

For a list that contains some lists and some integers, you need to test whether the element i is a list before testing whether the search target is in i.

>>> any(2 in i for i in a if isinstance(i, list))
True
>>> any(8 in i for i in a if isinstance(i, list))
False

If you don't check whether i is a list, then you'll get an error like below. The accepted answer is wrong, because it gives this error.

>>> any(8 in i for i in a)
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    any(8 in i for i in a)
  File "<pyshell#3>", line 1, in <genexpr>
    any(8 in i for i in a)
TypeError: argument of type 'int' is not iterable

>>> a = [[1,2],[3,4],[5,6],7,8,9]
>>> any(2 in i for i in a)
True

I think this type of situation is where we can take some inspiration from functional programming by delegating the evaluation of the boolean expression to its own function. This way, if you need to change the behaviour of your bool condition, you only need to change that function definition!

Let's say you want to check sublists AND also int that happen to be in the top level. We can define a function that returns a boolean when performing a comparison on a single list element:

def elem(a, b):
    '''
    Defines if an object b matches a.
    '''

    return (isinstance(b, int) and a == b) or (isinstance(b, list) and a in b)

Note here that this function says nothing about our list - the argument b in our usage is just a single element within the list, but we could just as easily call it just to compare two values. Now we have the following:

>>> a = [[1,2],[3,4],[5,6],7,8,9]

>>> any(elem(2, i) for i in a)
True

>>> any(elem(8, i) for i in a)
True

>>> any(elem(10, i) for i in a)
False

Bingo! Another benefit of this type of definition is that it allows you to partially apply functions, and gives you the ability to assign names to searches for only one type of number:

from functools import partial

>>> contains2 = partial(elem, 2)

>>> any(map(contains2, a))
True

>>> b = [[1],[3,4],[5,6],7,8,9]]
>>> any(map(contains2, b))
False

In my opinion, this makes for more readable code at the cost of a bit of boilerplate and the need to know what map does - since you can make your variable names sensical rather than a jungle of temporary list comprehension variables. I don't particularly care if the functional approach is less Pythonic - Python is a multiparadigm language and I think it looks better this way, plain and simple. But that's personal choice - it's up to you.

Now let's say that our situation has changed and we now want to check only the sublists - it's not enough for an occurrence in a top level. That's okay because now all we need to change is our definition of elem. Let's see:

def elem(a, b):
    return isinstance(b, list) and a in b

We've just removed the possibility of a match in the case that b is a top level int! If we run this now:

>>> a = [[1,2],[3,4],[5,6],7,8,9,"a",["b","c"]]
>>> any(elem(2, i) for i in a)
True
>>> any(elem(8, i) for i in a)
False

I'll illustrate one final example that really drives home how powerful this type of definition is. Suppose we have a list of arbitrarily deeply nested lists of integers. How do we check if an integer is in any of the levels?

We can take a recursive approach - and it doesn't take much modification at all:

def elem(a, b):
    return (isinstance(b, int) and a == b) or \
           (isinstance(b, list) and any(map(partial(elem, a), b)))

Because we've used this recursive definition that is defined to act on a single element, all the previous lines of code used still work:

>>> d = [1, [2, [3, [4, 5]]]]
>>> any(elem(1, i) for i in d)
True
>>> any(elem(4, i) for i in d)
True
>>> any(elem(10, i) for i in d)
False
>>> any(map(contains2, d))
True

Of course, given that this function is now recursive, we can really just call it directly:

>>> elem(4, d)
True

But the point remains that this modular approach has allowed us to alter the functionality by only changing the definition of elem without touching our main script, which means less TypeErrors and quicker refactoring.

Tags:

Python

List