Characters like 'é'
can be expressed either as a single code point or as a cluster of the letter 'e'
and a combining
accent mark. Without the CANON_EQ
flag, a regex will only match a string in which the characters are expressed in the same way.
String s = "e\u0300"; Pattern p = Pattern.compile("é|ë|è"); // Noncompliant System.out.println(p.matcher(s).replaceAll("e")); // print 'é'
String s = "e\u0300"; Pattern p = Pattern.compile("é|ë|è", Pattern.CANON_EQ); System.out.println(p.matcher(s).replaceAll("e")); // print 'e'