NLP to find relationship between entities

Yes, absolutely. This is called relation extraction. Stanford has developed several useful tools for this problem.

Here is their website: http://deepdive.stanford.edu/relation_extraction
Here is a GitHub repository with a Python wrapper for Stanford OpenIE: https://github.com/philipperemy/Stanford-OpenIE-Python

In general, here is how the process works:

# extract_entity_relations is a placeholder name, not a real library function
results = extract_entity_relations("Barack Obama was born in Hawaii.")
print(results)
# [['Barack Obama', 'was born in', 'Hawaii']]

Note that only triples of the form (subject, predicate, object) are extracted.
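To make the triple format concrete, here is a minimal sketch of a toy, pattern-based stand-in for the call above. It is not how OpenIE works (real systems derive relations from a parse); the `PATTERNS` list and the function body are assumptions made up purely for illustration.

```python
import re

# Hypothetical, hand-written patterns. Each one captures a subject, a fixed
# predicate phrase, and an object. Real extractors do not rely on a fixed list.
PATTERNS = [
    re.compile(r"^(?P<subject>.+?) (?P<predicate>was born in) (?P<object>.+?)\.?$"),
    re.compile(r"^(?P<subject>.+?) (?P<predicate>works for) (?P<object>.+?)\.?$"),
]


def extract_entity_relations(sentence):
    """Return a list of [subject, predicate, object] triples found in sentence."""
    triples = []
    for pattern in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            triples.append([match["subject"], match["predicate"], match["object"]])
    return triples


results = extract_entity_relations("Barack Obama was born in Hawaii.")
print(results)
```

A sentence that matches none of the patterns simply yields an empty list, which is the main weakness of fixed patterns and the reason parser-based tools exist.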


You can extract verbs with their dependents using, for example, the Stanford Parser. You might get "dependency chains" like

"I :: spent :: at :: CERN". 
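As a rough sketch of how such a chain could be assembled once you have a dependency parse: the parse below is written by hand for a hypothetical sentence ("I spent a year at CERN"), and the `Token` type and the labels (`nsubj`, `prep`, `pobj`) are illustrative assumptions. A real pipeline would take the tokens, heads, and labels from the parser's output, whose label set may differ.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    head: int  # index of the governing token; the root points to itself
    dep: str   # dependency label


# Hand-written parse of the hypothetical sentence "I spent a year at CERN".
SENTENCE = [
    Token("I", 1, "nsubj"),
    Token("spent", 1, "ROOT"),
    Token("a", 3, "det"),
    Token("year", 1, "dobj"),
    Token("at", 1, "prep"),
    Token("CERN", 4, "pobj"),
]


def children(tokens: List[Token], head_index: int, dep: str) -> List[int]:
    """Indices of tokens attached to head_index with the given label."""
    return [
        i for i, t in enumerate(tokens)
        if t.head == head_index and t.dep == dep and i != head_index
    ]


def verb_chains(tokens: List[Token]) -> List[str]:
    """Build subject :: verb :: preposition :: object chains for each root verb."""
    chains = []
    for v, tok in enumerate(tokens):
        if tok.dep != "ROOT":
            continue
        for s in children(tokens, v, "nsubj"):
            for p in children(tokens, v, "prep"):
                for o in children(tokens, p, "pobj"):
                    chains.append(" :: ".join(tokens[i].text for i in (s, v, p, o)))
    return chains


print(verb_chains(SENTENCE))
```

Only the subject, the verb, and prepositional objects are followed here; the direct object ("year") is deliberately ignored to reproduce the shape of the chain shown above.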

It is a much tougher task to recognise that "I spent at CERN", "I visited CERN", and "CERN hosted my visit" (etc.) denote the same kind of event. Going into how this can be done is beyond the scope of an SO question, but you can read up on the literature on paraphrase recognition (here is one overview paper). There is also a related question on SO.

Once you can cluster similar chains, you'll need a way to label each cluster. You could simply choose the verb of the most common chain in a cluster.
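The labelling step could look something like the sketch below. It assumes a cluster is just a list of chain strings in the "subject :: verb :: ..." form and that the verb is always the second element; both are simplifications, and `label_cluster` is a made-up helper name.

```python
from collections import Counter


def label_cluster(chains):
    """Label a cluster with the verb of its most frequent chain.

    Each chain is a string such as "I :: visited :: CERN"; the verb is
    assumed to be the second element.
    """
    most_common_chain, _count = Counter(chains).most_common(1)[0]
    return most_common_chain.split(" :: ")[1]


# Made-up cluster of chains that are assumed to describe the same kind of event.
cluster = [
    "I :: visited :: CERN",
    "I :: visited :: CERN",
    "I :: spent :: at :: CERN",
    "CERN :: hosted :: visit",
]

print(label_cluster(cluster))
```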

If, however, you have a pre-defined set of relation types you want to extract and lots of texts manually annotated for these relations, then the approach could be very different, e.g., using machine learning to learn how to recognize a relation type based on annotated data.
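To give a feel for the supervised route, here is a minimal sketch using a hand-rolled multinomial Naive Bayes classifier with add-one smoothing. The choice of classifier, the bag-of-words features, the relation labels, and the six training examples are all assumptions for illustration; in practice you would use a proper ML library, richer features, and far more annotated data.

```python
import math
from collections import Counter, defaultdict

# Tiny, made-up training set: (text between the two entities, relation label).
TRAIN = [
    ("was born in", "birthplace"),
    ("is a native of", "birthplace"),
    ("was born and raised in", "birthplace"),
    ("works for", "employer"),
    ("is employed by", "employer"),
    ("joined the staff of", "employer"),
]


def train(examples):
    """Count words per label and examples per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "was born in"))
print(predict(model, "works for"))
```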


Don't know if you're still interested, but CoreNLP added a new annotator called OpenIE (Open Information Extraction), which should accomplish what you're looking for. Check it out: OpenIE