Regex pattern including all special characters

You have a dash in the middle of the character class, which is interpreted as a character range. Put the dash at the end of the class, where it is treated as a literal, like so:

[$&+,:;=?@#|'<>.^*()%!-]
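To see the difference in practice, here is a minimal sketch. The dash-in-the-middle class is only an assumption about what the original pattern looked like (the .-^ part is the piece that matters), and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class DashPlacementDemo {
    // Assumed original: the dash sits between '.' and '^', so ".-^" is parsed as a range.
    static final Pattern DASH_IN_MIDDLE = Pattern.compile("[$&+,:;=?@#|'<>.-^*()%!]");

    // Same characters with the dash moved to the end, where it is a literal dash.
    static final Pattern DASH_AT_END = Pattern.compile("[$&+,:;=?@#|'<>.^*()%!-]");

    // Returns true if any character of s is matched by the given class.
    static boolean containsMatch(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch(DASH_IN_MIDDLE, "7")); // true: digits fall inside .-^
        System.out.println(containsMatch(DASH_AT_END, "7"));    // false
        System.out.println(containsMatch(DASH_IN_MIDDLE, "-")); // false: the dash itself is not in the class
        System.out.println(containsMatch(DASH_AT_END, "-"));    // true
    }
}
```

Note that with the dash in the middle, the dash itself no longer matches, which is usually the opposite of what was intended.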

Since you don't have whitespace and underscore in your character class, I think the following regex will be better for you:

Pattern regex = Pattern.compile("[^\\w\\s]");

This matches any character other than those in [A-Za-z0-9\s_].
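A small sketch of how this negated class might be used, both to detect and to strip special characters. The class and method names are hypothetical, and the behaviour shown assumes Java's default (ASCII-only) meaning of \w.

```java
import java.util.regex.Pattern;

final class NonWordDemo {
    // Anything that is not a word character (A-Z, a-z, 0-9, _) and not whitespace.
    static final Pattern SPECIAL = Pattern.compile("[^\\w\\s]");

    // True if s contains at least one special character.
    static boolean hasSpecial(String s) {
        return SPECIAL.matcher(s).find();
    }

    // Removes every special character from s.
    static String stripSpecial(String s) {
        return SPECIAL.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(hasSpecial("hello world_1"));  // false
        System.out.println(hasSpecial("hello!"));         // true
        System.out.println(stripSpecial("a_b c!d?"));     // a_b cd
        // By default \w is ASCII-only, so an accented letter counts as special.
        System.out.println(hasSpecial("\u00e9"));         // true
    }
}
```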

Unicode version:

Pattern regex = Pattern.compile("[^\\p{L}\\d\\s_]");
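As a rough illustration of what the Unicode version changes, the sketch below checks a string containing an accented letter. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

final class UnicodeLetterDemo {
    // Anything that is not a Unicode letter, a digit, whitespace, or underscore.
    static final Pattern SPECIAL = Pattern.compile("[^\\p{L}\\d\\s_]");

    // True if s contains at least one special character.
    static boolean hasSpecial(String s) {
        return SPECIAL.matcher(s).find();
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; the escape avoids source-encoding issues.
        System.out.println(hasSpecial("caf\u00e9 42")); // false: é is a letter under \p{L}
        System.out.println(hasSpecial("caf\u00e9!"));   // true: '!' is not a letter, digit, space or _
    }
}
```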

That's because your pattern contains .-^ which is a range covering every character from . through ^ inclusive. That range includes the digits, the uppercase letters, and several other characters such as / : ; < = > ? @ [ \ ].
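The contents of that range can be listed with a few lines of code. This is just a sketch with a made-up helper name; it walks the code points from . to ^ and collects them.

```java
final class RangeDemo {
    // Returns every character from 'from' to 'to' inclusive, in code point order.
    static String charsInRange(char from, char to) {
        StringBuilder sb = new StringBuilder();
        for (int c = from; c <= to; c++) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: ./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^
        System.out.println(charsInRange('.', '^'));
    }
}
```

That is 49 characters in total, far more than the three the pattern appears to list.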

If by special characters you mean punctuation and symbols, use:

[\p{P}\p{S}]

which matches all Unicode punctuation and symbols.
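In Java the backslashes have to be doubled inside the string literal. The following sketch counts punctuation and symbol characters in a string; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PunctSymbolDemo {
    // \p{P} = any Unicode punctuation, \p{S} = any Unicode symbol.
    static final Pattern SPECIAL = Pattern.compile("[\\p{P}\\p{S}]");

    // Counts how many punctuation or symbol characters s contains.
    static int countSpecial(String s) {
        Matcher m = SPECIAL.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSpecial("a+b=c!"));     // 3: '+', '=' and '!'
        System.out.println(countSpecial("abc 123"));    // 0
        System.out.println(countSpecial("snake_case")); // 1: '_' is connector punctuation
        System.out.println(countSpecial("\u263A"));     // 1: WHITE SMILING FACE is a symbol
    }
}
```

One thing to be aware of with this approach is that the underscore counts as punctuation, unlike in the \w-based patterns.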


Please don't do that... little Unicode BABY ANGELs like this one 👼 are dying! ◕◡◕ (← these are not images) (nor is the arrow!)

And you are killing 20 years of DOS :-) ☺ (the last smiley is called WHITE SMILING FACE. Now it's at U+263A, but in ancient times it was ALT-1)

and his friend ☻ (BLACK SMILING FACE. Now it's at U+263B, but in ancient times it was ALT-2).

Try a negative match:

Pattern regex = Pattern.compile("[^A-Za-z0-9]");

(This accepts only the "standard" letters A-Z and a-z and the "standard" digits 0-9; everything else, including whitespace and underscore, is matched as special.)

Tags:

Java

Regex