Case and diacritic insensitive matching of regex with metacharacter in Swift

It might be worthwhile to go in a different direction. Instead of flattening the input, what if you changed the regex?

Instead of matching against hate.you, you could match against [h][åæaàâä][t][ëèêeé].[y][o0][ùu], for example (the character lists here are not comprehensive). It makes the most sense to do this transformation on the fly rather than storing the expanded pattern, since that makes it easier to change what each character expands to later.

This gives you more control over which characters match. Notice that I have 0 as a character matching o; no amount of Unicode folding or normalization would let you do that.
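As a rough illustration of the idea, here is a minimal Swift sketch that rebuilds the pattern on the fly from a lookup table. The table contents and the names equivalents, expandedPattern(for:) and matches(_:pattern:) are made up for this example; only NSRegularExpression is real Foundation API. It does not parse backslash escapes or existing character classes, so it assumes a simple pattern like the one above.

```swift
import Foundation

// Hypothetical lookup table: each base letter maps to the characters that
// should count as equivalent. Edit it freely; nothing is stored elsewhere.
let equivalents: [Character: String] = [
    "a": "aàâäåæ",
    "e": "eèéêë",
    "o": "o0",   // a digit standing in for a letter
    "u": "uù",
]

// Rebuilds the pattern on the fly. Characters without a table entry,
// including metacharacters such as ".", are copied through unchanged.
func expandedPattern(for pattern: String) -> String {
    var result = ""
    for character in pattern {
        if let variants = equivalents[character] {
            result += "[\(variants)]"
        } else {
            result.append(character)
        }
    }
    return result
}

// Returns true if the expanded, case-insensitive pattern matches anywhere in text.
func matches(_ text: String, pattern: String) throws -> Bool {
    let regex = try NSRegularExpression(pattern: expandedPattern(for: pattern),
                                        options: [.caseInsensitive])
    let range = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}
```

With this table, try matches("HÂTE Y0U", pattern: "hate.you") should return true, while "h4te you" should not match because 4 is not listed for a. The classes contain precomposed characters, so input that uses combining marks would need normalizing first, for instance with precomposedStringWithCanonicalMapping.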


I ended up using the solution suggested by Laurel. It works well for me.

I'm posting it here for anyone who might need it.

import Foundation

extension String {
    /// Builds a regex from this pattern that ignores case and the diacritics listed below.
    /// Caveat: the whole pattern is folded and expanded, so backslash escapes
    /// that contain letters (e.g. \D or \s) are altered as well.
    func getCaseAndDiacriticInsensitiveRegex() throws -> NSRegularExpression {
        // Strip case and diacritics from the pattern itself first.
        var pattern = self.folding(options: [.caseInsensitive, .diacriticInsensitive], locale: .current)
        // Expand each base letter into a class of its accented variants.
        let expansions = [
            ("a", "[aàáâäæãåā]"),
            ("c", "[cçćč]"),
            ("e", "[eèéêëēėę]"),
            ("l", "[lł]"),
            ("i", "[iîïíīįì]"),
            ("n", "[nñń]"),
            ("o", "[oôöòóœøōõ]"),
            ("s", "[sßśš]"),
            ("u", "[uûüùúū]"),
            ("y", "[yýÿ]"),
            ("z", "[zžźż]"),
        ]
        for (letter, characterClass) in expansions {
            pattern = pattern.replacingOccurrences(of: letter, with: characterClass)
        }
        return try NSRegularExpression(pattern: pattern, options: [.caseInsensitive])
    }
}
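
One thing to watch with the extension above: folding lowercases the entire pattern, so \D becomes \d, and the letter replacements also hit escapes such as \s. If the pattern may contain backslash escapes, a variant along these lines might help. It is only a sketch: escapeAwarePattern and diacriticClasses are hypothetical names, the table is a shortened copy, and only single-character escapes are skipped (multi-character forms like \p{L} or \x41, named groups, and inline flags are not handled).

```swift
import Foundation

// Hypothetical subset of the table above, keyed by folded base letter.
let diacriticClasses: [Character: String] = [
    "a": "[aàáâäæãåā]",
    "e": "[eèéêëēėę]",
    "s": "[sßśš]",
]

// Expands letters into character classes but leaves "\x"-style escapes alone.
func escapeAwarePattern(_ pattern: String) -> String {
    var result = ""
    var iterator = pattern.makeIterator()
    while let character = iterator.next() {
        if character == "\\" {
            // Copy the backslash and the escaped character verbatim.
            result.append(character)
            if let escaped = iterator.next() {
                result.append(escaped)
            }
            continue
        }
        // Fold one character at a time so escapes are never lowercased.
        let folded = String(character).folding(
            options: [.caseInsensitive, .diacriticInsensitive], locale: nil)
        if folded.count == 1, let first = folded.first,
           let characterClass = diacriticClasses[first] {
            result += characterClass
        } else {
            result += folded
        }
    }
    return result
}
```

The result can be passed to NSRegularExpression(pattern:options:) with .caseInsensitive just as before; for example, escapeAwarePattern("É\\s") yields [eèéêëēėę]\s with the \s left intact.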