Parsing is the process of "using a grammar to assign a syntactic analysis to a string of words, a lattice of word hypotheses output by a speech recognizer" (Carroll, 2003, p. 233). In MTK, we use two types of grammar: constituency and dependency.

Constituency grammar

The fundamental idea of constituency is that groups of words form a single unit or phrase, called a constituent (Jurafsky & Martin, 2000).

Constituency grammar describes the syntactic structure of sentences in terms of phrasal hierarchies.
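To make the notion of a phrasal hierarchy concrete, the following minimal sketch represents a constituency analysis as nested tuples and collects all phrases with a given label. The tree is written by hand for illustration and is not parser output; the sentence, the Penn-Treebank-style labels, and the helper names `is_preterminal`, `leaves`, and `phrases` are assumptions made for this example.

```python
# Hand-written constituency tree for "The company bought the products".
# Each node is a tuple: (label, child, child, ...); a preterminal is (tag, word).
TREE = (
    "S",
    ("NP", ("DT", "The"), ("NN", "company")),
    ("VP",
     ("VBD", "bought"),
     ("NP", ("DT", "the"), ("NNS", "products"))),
)


def is_preterminal(node):
    """A preterminal pairs a part-of-speech tag with a single word."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node):
    """Return the words covered by a node, in sentence order."""
    if is_preterminal(node):
        return [node[1]]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def phrases(node, label):
    """Collect the text of every phrase (non-preterminal node) with this label."""
    if is_preterminal(node):
        return []
    found = [" ".join(leaves(node))] if node[0] == label else []
    for child in node[1:]:
        found.extend(phrases(child, label))
    return found


noun_phrases = phrases(TREE, "NP")
print(noun_phrases)
```

In this sketch, the noun phrases mark the phrase boundaries that a concept-identification step could use as candidates.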


Dependency grammar


Dependency grammars focus on the direct relations between words in a particular sentence.
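As an illustration, a dependency analysis can be written as a flat list of (head, relation, dependent) triples. The sketch below groups such triples by their head word. The triples are written by hand using relation names from the Stanford typed dependencies (`nsubj`, `dobj`, `det`); the example sentence and the helper `group_by_head` are assumptions, and real parser output would also carry token positions to tell repeated words apart.

```python
from collections import defaultdict

# Hand-written (head, relation, dependent) triples for
# "The company bought the products".
DEPENDENCIES = [
    ("bought", "nsubj", "company"),   # the company is the subject of "bought"
    ("bought", "dobj", "products"),   # the products are the direct object
    ("company", "det", "The"),
    ("products", "det", "the"),
]


def group_by_head(dependencies):
    """Map each head word to the list of (relation, dependent) pairs it governs."""
    grouped = defaultdict(list)
    for head, relation, dependent in dependencies:
        grouped[head].append((relation, dependent))
    return dict(grouped)


by_head = group_by_head(DEPENDENCIES)
print(by_head["bought"])
```

Unlike the phrasal hierarchy of a constituency analysis, every relation here links two words directly, with the verb acting as the head of the sentence.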



Parsing and formal languages

The phrase-based approach identifies phrases and structural categories in a given sentence. Analysing sentence structure through the lens of a constituency grammar, we can extract the phrase boundaries, which might help in identifying concepts.

Dependency grammar, on the other hand, seems well suited to identifying the relationships between concepts and the attributes of a particular concept. The reason is its ability to discover head-based relations (e.g. the verb as head) and functional categories (e.g. subject, direct object, complement of a preposition).


In the context of natural language, the core items important for SBVR are the verb and its relations to the subject (actor) and the object. Identifying the verb with a constituency approach is possible, but some cases, such as passive constructions, can cause problems. Furthermore, identifying the correct subject and object often fails with a constituency grammar when sentences are long or the subject appears after the verb. Dependency grammar focuses on identifying the verb and the dependencies between the different parts of the sentence. In the MTK, we built an interface that takes the dependency analysis produced by the Stanford Parser and extracts the verbs together with the subjects and objects related to each verb (the head word). First tests have shown that even passive constructions such as "The products have been bought by the company" are processed correctly.
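The extraction step can be sketched as follows. This is not the MTK interface itself, whose code is not shown here, but a minimal illustration under stated assumptions: the triples are written by hand in the style of the Stanford collapsed typed dependencies (`nsubj`, `dobj`, `nsubjpass`, `agent`), and the names `ACTOR_RELATIONS`, `OBJECT_RELATIONS`, and `extract_verb_frames` are hypothetical. Treating every head of such a relation as a verb is a simplification.

```python
# Hand-written triples (head, relation, dependent), not actual parser output.
ACTIVE = [  # "The company bought the products"
    ("bought", "nsubj", "company"),
    ("bought", "dobj", "products"),
    ("company", "det", "The"),
    ("products", "det", "the"),
]
PASSIVE = [  # "The products have been bought by the company"
    ("bought", "nsubjpass", "products"),
    ("bought", "aux", "have"),
    ("bought", "auxpass", "been"),
    ("bought", "agent", "company"),
    ("products", "det", "The"),
    ("company", "det", "the"),
]

# Assumed mapping from grammatical relations to semantic roles:
# in a passive clause the grammatical subject is the thing acted upon,
# and the "by" phrase (agent) names the actor.
ACTOR_RELATIONS = {"nsubj", "agent"}
OBJECT_RELATIONS = {"dobj", "nsubjpass"}


def extract_verb_frames(dependencies):
    """Return one {verb, actor, object} record per head word that has such relations."""
    frames = {}
    for head, relation, dependent in dependencies:
        if relation in ACTOR_RELATIONS:
            role = "actor"
        elif relation in OBJECT_RELATIONS:
            role = "object"
        else:
            continue  # determiners, auxiliaries, etc. are ignored here
        frame = frames.setdefault(head, {"verb": head, "actor": None, "object": None})
        frame[role] = dependent
    return list(frames.values())


active_frames = extract_verb_frames(ACTIVE)
passive_frames = extract_verb_frames(PASSIVE)
print(active_frames)
print(passive_frames)
```

Under these assumptions, the active and the passive sentence yield the same record, with "company" as the actor and "products" as the object of "bought".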