Having both NER and RegexNER tags in StanfordCoreNLPServer output?

Here's what the RegexNER documentation says about this:

RegexNER will not overwrite an existing entity assignment, unless you give it permission in a third tab-separated column, which contains a comma-separated list of entity types that can be overwritten. Only the non-entity O label can always be overwritten, but you can specify extra entity tags which can always be overwritten as well.

Bachelor of (Arts|Laws|Science|Engineering|Divinity)	DEGREE
Lalor	LOCATION	PERSON
Labor	ORGANIZATION

I'm not sure exactly what your mapping file looks like, but if it just maps entities to labels (two columns), then the original NER annotator will already have labeled your entities as NUMBER, and RegexNER won't be able to overwrite them. If you explicitly declare, in a third column of your mapping file, that NUMBER may be overwritten for the entries you want tagged as SURFACE, then it should work.
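To make that concrete, here is a rough Python sketch of how such a mapping line and the matching server request could be put together. The pattern `[0-9]+`, the file name `surface.tab`, the server address, and the helper names `mapping_line` and `server_url` are all made up for illustration, since I don't know your actual entries; the three-column format, the `regexner.mapping` property, and the `properties` URL parameter are the parts that come from CoreNLP itself.

```python
import json
from urllib.parse import urlencode


def mapping_line(pattern, label, overwritable=()):
    """Build one RegexNER mapping line.

    Columns are tab-separated: pattern, label, and (optionally) a
    comma-separated list of entity types RegexNER may overwrite.
    """
    columns = [pattern, label]
    if overwritable:
        columns.append(",".join(overwritable))
    return "\t".join(columns)


def server_url(base, mapping_path):
    """Build a CoreNLP server URL that runs regexner after ner."""
    props = {
        "annotators": "tokenize,ssplit,pos,lemma,ner,regexner",
        "regexner.mapping": mapping_path,  # hypothetical file name, see lead-in
        "outputFormat": "json",
    }
    return base + "/?" + urlencode({"properties": json.dumps(props)})


# Hypothetical entry: tokens that NER tags as NUMBER may be relabeled SURFACE.
line = mapping_line("[0-9]+", "SURFACE", ["NUMBER"])
print(repr(line))

# Assumed local server address; POST your text to this URL.
print(server_url("http://localhost:9000", "surface.tab"))
```

The only point is the third column: without `NUMBER` there, RegexNER leaves the existing NUMBER tag alone.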


OK, things seem to work the way I want if I put regexner first:

"annotators":"regexner,tokenize,ssplit,pos,ner",

It seems there is an ordering problem at some stage of the process?