Regex match Arabic keyword

I used the character class [ء-ي٠-٩] (Arabic letters plus Arabic-Indic digits), and it works for me.


First, we have to understand what \b means:

\b is an anchor that matches at a position that is called a "word boundary".

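In your case, the word boundary you are looking for is a position that has no Arabic letter next to it. In JavaScript, \b is defined in terms of \w, which covers only [A-Za-z0-9_], so \b cannot find that position for you.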
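A quick check makes the problem concrete. This is only a small sketch with sample strings of my own; the variable names are illustrative and not part of the question.

```javascript
var word = 'رياض';
var withBoundaries = new RegExp('\\b' + word + '\\b');

// \b sits between a \w character ([A-Za-z0-9_]) and a non-\w character.
// A space and an Arabic letter are both non-\w, so there is no boundary between them.
var spaced = withBoundaries.test(' ' + word + ' ');

// A Latin letter next to an Arabic letter does count as a boundary,
// which is the opposite of what we want here.
var glued = withBoundaries.test('a' + word + 'a');

console.log(spaced, glued); // false true
```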

To match only Arabic letters in a regex, we can use Unicode escapes:

[\u0621-\u064A]+

Or we can simply use the Arabic letters directly:

[ء-ي]+
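As a rough illustration, the two spellings of the class should behave identically, since ء is U+0621 and ي is U+064A. The sample string below is my own.

```javascript
var sample = 'abc رياض 123 سعيد';

// Same character class, written once with escapes and once with literal letters.
var viaEscapes = sample.match(/[\u0621-\u064A]+/g);
var viaLiterals = sample.match(/[ء-ي]+/g);

// Both arrays hold the two Arabic words and nothing else.
console.log(viaEscapes, viaLiterals);
```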

The pattern above matches any run of Arabic letters. To make a word boundary out of it, we can simply negate the class on both sides of the word:

[^ء-ي]ARABIC TEXT[^ء-ي]

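The pattern above means: the character on either side of the Arabic word must not be an Arabic letter, which will work in your case. Note that each [^ء-ي] consumes one character, so the word needs some non-Arabic character (such as a space) on both sides.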

Consider the example you gave, which I modified slightly:

 أنا أحب رياضتي رياض رياضة رياضيات وأنا سعيد حقا هنا 

If we search for رياض on its own, the pattern also matches inside رياضة, رياضيات, and رياضتي. With the negated classes added on both sides, only the standalone رياض matches.

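var x = " أنا أحب رياضتي رياض رياضة رياضيات وأنا سعيد حقا هنا ";
// Match رياض only when the characters on both sides are not Arabic letters.
// The captured group (and so the red span) includes those two surrounding characters.
x = x.replace(/([^ء-ي]رياض[^ء-ي])/g, '<span style="color:red">$1</span>');
document.write(x);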
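If the keyword can also sit at the very start or end of the text, or you do not want the surrounding characters inside the span, one possible variation is to capture the preceding character separately and check the following one with a lookahead. This is a sketch, not a drop-in replacement: highlightArabicWord is a name I made up, and it assumes the keyword contains no regex metacharacters.

```javascript
// Hypothetical helper: wrap whole-word occurrences of an Arabic keyword in a red span.
// Assumes `word` contains only plain letters (no regex metacharacters).
function highlightArabicWord(text, word) {
  var pattern = new RegExp(
    '(^|[^\\u0621-\\u064A])' +   // start of text, or one non-Arabic character (kept as $1)
    '(' + word + ')' +           // the keyword itself ($2)
    '(?=[^\\u0621-\\u064A]|$)',  // lookahead: next character is non-Arabic, or end of text
    'g'
  );
  return text.replace(pattern, '$1<span style="color:red">$2</span>');
}

var highlighted = highlightArabicWord('رياض رياضة رياضيات رياض', 'رياض');
console.log(highlighted); // only the first and last words are wrapped
```

Because the lookahead does not consume the character after the keyword, two occurrences separated by a single space are both found.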

If you would like to account for the alef variants أآإا with a single pattern, you could use a character class such as [\u0622\u0623\u0625\u0627], or simply list the letters between square brackets: [أآإا]. Here is a complete example:

var x = "أنا هنا وانا هناك .. آنا هنا وإنا هناك";
x = x.replace(/([أآإا]نا)/g, '<span style="color:red">$1</span>');
document.write (x);
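The same idea can be applied to an arbitrary keyword by swapping every alef variant in it for the character class. The helper below is a sketch under that assumption; alefInsensitivePattern is a hypothetical name, and the keyword is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: build a regex that treats أ آ إ ا as interchangeable.
// Assumes `word` contains only plain letters (no regex metacharacters).
function alefInsensitivePattern(word) {
  var alefClass = '[\u0622\u0623\u0625\u0627]';
  // Replace each alef variant in the keyword with the class of all four variants.
  return new RegExp(word.replace(/[\u0622\u0623\u0625\u0627]/g, alefClass), 'g');
}

var text = "أنا هنا وانا هناك .. آنا هنا وإنا هناك";
var found = text.match(alefInsensitivePattern('انا'));
console.log(found.length); // 4
```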

Note: If you want to match the full range of common Arabic characters in a regex, including all Arabic letters أ ب ت ث ج, all diacritics َ ً ُ ٌ ِ ٍ ّ, and all Arabic numbers ١٢٣٤٥٦٧٨٩٠, use this regex: [،-٩]+
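To see what that range actually spans, you can print the code points of its two endpoints and try it on a mixed string. This is just a quick sketch with a sample word of my own.

```javascript
// Endpoints of the range [،-٩]: the Arabic comma and the Arabic-Indic digit nine.
var first = '،'.charCodeAt(0).toString(16); // "60c"
var last = '٩'.charCodeAt(0).toString(16);  // "669"

// The same range written with escapes, anchored so the whole string must match.
var broad = /^[\u060C-\u0669]+$/;

// Letters, diacritics (fatha U+064E, kasra U+0650) and digits all fall inside the range.
var arabicOk = broad.test('كِتَاب١٢٣');
// Latin text does not.
var latinOk = broad.test('abc');

console.log(first, last, arabicOk, latinOk);
```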

Useful link about the ordering of Arabic characters in Unicode: https://en.wikipedia.org/wiki/Arabic_script_in_Unicode


This doesn't work because the regex engine's word boundary (\b) does not support Arabic letters. You could instead search for the Unicode characters in the text (using Unicode ranges).

Or you could convert the text into Unicode escapes and then build the regex from those (I have never tried this, but it should work).
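For what it's worth, the conversion idea could look roughly like the sketch below. toUnicodeEscapes is a hypothetical helper, and it assumes the keyword stays within the Basic Multilingual Plane (true for standard Arabic letters). Since JavaScript strings are already Unicode, the resulting regex should match the same text as one written with the literal letters.

```javascript
// Hypothetical helper: turn each UTF-16 code unit into a \uXXXX escape.
function toUnicodeEscapes(text) {
  return text.split('').map(function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return '\\u' + ('0000' + hex).slice(-4);
  }).join('');
}

var escaped = toUnicodeEscapes('رياض');
console.log(escaped); // \u0631\u064A\u0627\u0636

// Build the regex from the escaped source, surrounded by the negated Arabic class.
var fromEscapes = new RegExp('[^\\u0621-\\u064A]' + escaped + '[^\\u0621-\\u064A]');
console.log(fromEscapes.test(' رياض '), fromEscapes.test(' رياضة '));
```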