How to do word counts for a mixture of English and Chinese in JavaScript

Try a regex like this:

/[\u00ff-\uffff]|\S+/g

For example, "I am a 香港人".match(/[\u00ff-\uffff]|\S+/g) gives:

["I", "am", "a", "香", "港", "人"]

Then you can just check the length of the resulting array.

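The \u00ff-\uffff part of the regex is a Unicode character range. It is very broad: it also covers non-CJK scripts such as Greek and Cyrillic, so every letter of those scripts would be counted as a separate word. You probably want to narrow it down to just the characters that should each count as one word. For example, the CJK Unified Ideographs block is \u4e00-\u9fff.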

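function countWords(str) {
    // Match one character in the range, or else a run of non-whitespace characters.
    var matches = str.match(/[\u00ff-\uffff]|\S+/g);
    // match() returns null when nothing matches, so fall back to 0.
    return matches ? matches.length : 0;
}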
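
If you narrow the range as suggested, something like the following might work. This is only a sketch: the name countMixedWords and the choice of \u4e00-\u9fff are my assumptions, and you may need more ranges (extension blocks, kana, hangul) depending on your text. It also excludes the CJK range from the second alternative, because with the plain \S+ a string like "hello香港" is matched as a single token when there is no space between the English and the Chinese.

```javascript
// Hypothetical helper: the name and the chosen range are assumptions.
// Each CJK Unified Ideograph (U+4E00 to U+9FFF) counts as one word;
// any other run of non-whitespace, non-CJK characters counts as one word.
function countMixedWords(str) {
    var matches = str.match(/[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+/g);
    // match() returns null when there is no match at all.
    return matches ? matches.length : 0;
}

console.log(countMixedWords("I am a 香港人")); // 6
console.log(countMixedWords("hello香港"));     // 3 ("hello", "香", "港")
console.log(countMixedWords(""));              // 0
```

Note that punctuation still counts: a standalone "!" or a full-width "。" is a non-whitespace run, so strip or exclude punctuation first if that matters to you.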