How to extract all the emojis from text?

I think it's important to point out that the previous answers won't work with emojis like 👨‍👩‍👦‍👦, because it consists of 4 emojis (joined by zero-width joiners), and using ... in emoji.UNICODE_EMOJI will return 4 different emojis. The same goes for emojis with a skin tone, like 🙅🏽.
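To see why a per-character check splits such emojis, here is a small stdlib-only sketch that lists the code points behind a family emoji and a skin-toned emoji. The helper name describe and the two sample strings are my own choices for illustration; they are not part of the emoji or regex modules.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for every code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# Man + ZWJ + woman + ZWJ + boy + ZWJ + boy, rendered as one family emoji.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F466\u200D\U0001F466"
# A base emoji followed by a skin-tone modifier (U+1F3FD).
toned = "\U0001F645\U0001F3FD"

for label, sample in (("family", family), ("skin tone", toned)):
    print(label, len(sample), "code points")
    for code_point, name in describe(sample):
        print("   ", code_point, name)
```

Iterating over the string visits each of these code points separately, which is why a check like c in emoji.UNICODE_EMOJI reports several emojis where the reader sees one.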

My solution

Import the emoji and regex modules. The regex module supports recognizing grapheme clusters (sequences of Unicode codepoints rendered as a single character), so we can count emojis like 👨‍👩‍👦‍👦 as a single emoji.

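import emoji
import regex

def split_count(text):
    """Return the emojis in text, one list entry per grapheme cluster."""
    emoji_list = []
    # \X matches one extended grapheme cluster, so ZWJ sequences and
    # skin-tone variants arrive as a single string.
    for cluster in regex.findall(r'\X', text):
        # Keep the whole cluster if any of its codepoints is a known emoji.
        if any(char in emoji.UNICODE_EMOJI['en'] for char in cluster):
            emoji_list.append(cluster)
    return emoji_list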

Testing

With more emojis, including some with skin tones:

line = ["🤔 🙈 me así, se 😌 ds 💕👭👙 hello 👩🏾‍🎓 emoji hello 👨‍👩‍👦‍👦 how are 😊 you today🙅🏽🙅🏽"]

counter = split_count(line[0])
print(' '.join(counter))

Output:

🤔 🙈 😌 💕 👭 👙 👩🏾‍🎓 👨‍👩‍👦‍👦 😊 🙅🏽 🙅🏽

Include flags

If you want to include flags, like 🇵🇰, note that each flag is a pair of regional indicator symbols from the Unicode range 🇦 (U+1F1E6) to 🇿 (U+1F1FF), so add:

flags = regex.findall(u'[\U0001F1E6-\U0001F1FF]', text)

to the function above, and return emoji_list + flags.

See this answer to "A python regex that matches the regional indicator character class" for more information about the flags.
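One caveat with a single-symbol character class is that it matches each regional indicator on its own, so a flag comes back as two entries. The following is a hedged variant, not the pattern from the answer: it uses the stdlib re module and requires exactly two consecutive indicators per match. FLAG_PAIR and find_flags are hypothetical names chosen for this sketch.

```python
import re

# Two consecutive regional indicator symbols (U+1F1E6..U+1F1FF) form one flag.
FLAG_PAIR = re.compile("[\U0001F1E6-\U0001F1FF]{2}")

def find_flags(text):
    """Return each flag in text as a single two-codepoint string."""
    return FLAG_PAIR.findall(text)

# Regional indicators P+K and U+S, written as escapes to keep the source ASCII.
sample = "from \U0001F1F5\U0001F1F0 to \U0001F1FA\U0001F1F8"
print(find_flags(sample))
```

Adjacent flags with no separator are paired left to right, and a stray unpaired indicator is skipped rather than reported.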

For newer `emoji` versions

To work with emoji >= v1.2.0, you have to add a language specifier (e.g. en, as in the code above):

emoji.UNICODE_EMOJI['en']
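If the same code has to run against several emoji versions, one option is to pick the lookup table at runtime. This is a sketch under assumptions: to my knowledge, versions before 1.2.0 expose a flat UNICODE_EMOJI dict, 1.2.0 and later nest it under a language key, and 2.0 and later replace it with EMOJI_DATA, but please check the changelog of the version you have installed. emoji_lookup is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the real module so the example runs without it.

```python
from types import SimpleNamespace

def emoji_lookup(emoji_module):
    """Return a dict whose `in` test says whether a string is a known emoji."""
    # Assumption: emoji >= 2.0 exposes EMOJI_DATA instead of UNICODE_EMOJI.
    data = getattr(emoji_module, "EMOJI_DATA", None)
    if data is not None:
        return data
    table = emoji_module.UNICODE_EMOJI
    # emoji >= 1.2.0 nests the table under a language key; older ones are flat.
    return table["en"] if "en" in table else table

# Stand-ins for the three module layouts (hypothetical, for illustration only).
smile = "\U0001F60A"
flat = SimpleNamespace(UNICODE_EMOJI={smile: ":smile:"})
nested = SimpleNamespace(UNICODE_EMOJI={"en": {smile: ":smile:"}})
newer = SimpleNamespace(EMOJI_DATA={smile: {"en": ":smile:"}})

for layout in (flat, nested, newer):
    print(smile in emoji_lookup(layout))
```

With the real library you would call emoji_lookup(emoji) once after import emoji and use the result in place of emoji.UNICODE_EMOJI['en'].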

You can use the emoji library. A single codepoint is an emoji codepoint if it is contained in emoji.UNICODE_EMOJI (emoji.UNICODE_EMOJI['en'] for emoji >= v1.2.0).

import emoji

def extract_emojis(s):
    return ''.join(c for c in s if c in emoji.UNICODE_EMOJI['en'])

Tags:

Python

Python 3.X

Emoji
