Check whether a string contains Japanese/Chinese characters

You can use this code; it works for me.

const str = "渣打銀行提供一系列迎合你生活需要嘅信用卡";
// const str = "SGGRAND DING HOUSE 4GRAND DING HOUSE";
const REGEX_CHINESE = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\uff66-\uff9f]/;
// test() returns a boolean; match() would return an array or null.
const hasChinese = REGEX_CHINESE.test(str);
if (hasChinese) {
  alert("Found");
} else {
  alert("Not Found");
}

The ranges of Unicode characters which are routinely used for Chinese and Japanese text are:

U+3040 - U+30FF: hiragana and katakana (Japanese only)
U+3400 - U+4DBF: CJK unified ideographs extension A (Chinese, Japanese, and Korean)
U+4E00 - U+9FFF: CJK unified ideographs (Chinese, Japanese, and Korean)
U+F900 - U+FAFF: CJK compatibility ideographs (Chinese, Japanese, and Korean)
U+FF66 - U+FF9F: half-width katakana (Japanese only)

As a regular expression, this would be expressed as:

/[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\uff66-\uff9f]/

This does not include every character which will appear in Chinese and Japanese text, but any significant piece of typical Chinese or Japanese text will be mostly made up of characters from these ranges.
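Because the ranges only cover the most common characters, a single match says little about a longer string. Here is a rough sketch that measures how much of a string falls inside these ranges. The name cjkRatio and the choice to ignore whitespace are assumptions made for illustration, not part of the original answer.

```javascript
// Same ranges as above, with the global flag so match() returns every hit.
const REGEX_CJK_GLOBAL = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\uff66-\uff9f]/g;

// Hypothetical helper: fraction of non-whitespace characters inside the ranges.
// Lengths are counted in UTF-16 code units, so characters outside the
// Basic Multilingual Plane (e.g. emoji) count as two units and never match.
function cjkRatio(text) {
  const chars = text.replace(/\s/g, "");
  if (chars.length === 0) return 0;
  const hits = chars.match(REGEX_CJK_GLOBAL) || [];
  return hits.length / chars.length;
}

console.log(cjkRatio("渣打銀行提供一系列迎合你生活需要嘅信用卡")); // 1
console.log(cjkRatio("SGGRAND DING HOUSE 4GRAND DING HOUSE")); // 0
```

What ratio counts as "mostly Chinese or Japanese" is a judgment call for the application; no threshold is implied by the ranges themselves.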

Note that this regular expression will also match on Korean text that contains hanja. This is an unavoidable result of Han unification.
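Han unification means the ideograph ranges alone cannot tell the languages apart. The kana ranges, however, are listed above as Japanese only, so their presence is a reasonable hint that the text is Japanese. Here is a minimal sketch under that assumption. classifyCjk and its return labels are hypothetical names, and Japanese text written purely in kanji would still be reported as ambiguous.

```javascript
// Kana ranges (Japanese only): hiragana/katakana and half-width katakana.
const REGEX_KANA = /[\u3040-\u30ff\uff66-\uff9f]/;
// Ideograph ranges shared by Chinese, Japanese, and Korean.
const REGEX_HAN = /[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff]/;

// Hypothetical helper: returns "japanese", "han" (ambiguous), or "none".
function classifyCjk(text) {
  if (REGEX_KANA.test(text)) return "japanese";
  if (REGEX_HAN.test(text)) return "han";
  return "none";
}

console.log(classifyCjk("ABC検診センター")); // japanese
console.log(classifyCjk("渣打銀行")); // han
console.log(classifyCjk("GRAND DING HOUSE")); // none
```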

For Swift 4, I changed the pattern to use \u{...} escapes and used NSRegularExpression for the replacement. This might help someone:

[\u{3040}-\u{30ff}\u{3400}-\u{4dbf}\u{4e00}-\u{9fff}\u{f900}-\u{faff}\u{ff66}-\u{ff9f}]

As a String extension:

import Foundation

extension String {
    mutating func removeRegexMatches(pattern: String, replaceWith: String = "") {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
            // NSRange counts UTF-16 code units, so use utf16.count rather than count.
            let range = NSRange(location: 0, length: self.utf16.count)
            self = regex.stringByReplacingMatches(in: self, options: [], range: range, withTemplate: replaceWith)
        } catch {
            // Invalid pattern: leave the string unchanged.
            return
        }
    }

    mutating func removeEastAsianChars() {
        let regexPatternEastAsianCharacters = "[\u{3040}-\u{30ff}\u{3400}-\u{4dbf}\u{4e00}-\u{9fff}\u{f900}-\u{faff}\u{ff66}-\u{ff9f}]"
        removeRegexMatches(pattern: regexPatternEastAsianCharacters)
    }
}

Example (the methods are mutating, so the string must be a variable; the result is ABC):

var s = "ABC検診センター"
s.removeEastAsianChars()
// s is now "ABC"
