Approach Through Feature Structure and Unification Grammar

A tree-adjoining grammar was also built for Vietnamese in [22] by extraction from the Vietnamese treebank. In terms of expressive power, tree-adjoining grammars can represent some context-sensitive languages. This approach is effective when the Vietnamese treebank is large enough.

1.2. Approach through feature structure and unification grammar

Unification grammar is built on the unification (merging) of feature structures. A feature structure is represented as an Attribute Value Matrix (AVM) of the form:

Feature 1   Value 1

Feature 2   Value 2

…   …

Feature n   Value n

For example, a noun phrase structure in English with the features category: noun phrase, number: singular, person: third is written as follows:

CAT NP

NUMBER SG

PERSON 3

A feature structure is defined as a mapping F → VF, where F is the set of features and VF is the set of values that can be assigned to the features.

The example above is a feature structure over the feature set F = { CAT, NUMBER, PERSON } with the value set VF = { NP, SG, 3 }.

An augmented grammar contains rules of the form A → X1 … Xn, where A is the name of the parent feature structure and X1, …, Xn are the child feature structures.

Rules in an augmented grammar are written as feature structures containing variables, so that one rule can apply in many different situations. For example, the augmented rule for a simple noun phrase:

(NP  NUMBER  ?n) → (ART  NUMBER  ?n) (N  NUMBER  ?n )

expresses number agreement between the article and the noun: the variable ?n must take the same value in all three positions.
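The agreement rule above can be illustrated with a minimal Python sketch. The encoding is an assumption made for illustration only: a constituent is a pair (category, feature dictionary), strings starting with "?" are variables, and the names RULE, match and apply_rule are hypothetical, not taken from the cited works.

```python
# Hypothetical encoding of (NP NUMBER ?n) -> (ART NUMBER ?n) (N NUMBER ?n).
RULE = (("NP", {"NUMBER": "?n"}),
        [("ART", {"NUMBER": "?n"}), ("N", {"NUMBER": "?n"})])

def match(pattern, constituent, bindings):
    """Match one rule element against a constituent.
    Returns the extended variable bindings, or None on failure."""
    (p_cat, p_feats), (c_cat, c_feats) = pattern, constituent
    if p_cat != c_cat:
        return None
    bindings = dict(bindings)
    for feat, p_val in p_feats.items():
        c_val = c_feats.get(feat)
        if c_val is None:
            # Simplifying assumption: a required feature must be present.
            return None
        if p_val.startswith("?"):
            # A variable must take the same value everywhere it occurs.
            if bindings.setdefault(p_val, c_val) != c_val:
                return None
        elif p_val != c_val:
            return None
    return bindings

def apply_rule(rule, children):
    """Return the parent constituent if the children satisfy the rule, else None."""
    (parent_cat, parent_feats), rhs = rule
    if len(rhs) != len(children):
        return None
    bindings = {}
    for pattern, child in zip(rhs, children):
        bindings = match(pattern, child, bindings)
        if bindings is None:
            return None
    # Substitute bound variables into the parent's features.
    return (parent_cat, {f: bindings.get(v, v) for f, v in parent_feats.items()})

ok = apply_rule(RULE, [("ART", {"NUMBER": "SG"}), ("N", {"NUMBER": "SG"})])
bad = apply_rule(RULE, [("ART", {"NUMBER": "SG"}), ("N", {"NUMBER": "PL"})])
```

Here ok is the constituent ("NP", {"NUMBER": "SG"}), while bad is None because the article and the noun disagree in number.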

If feature structures are represented as graphs, the feature graphs can be merged into one larger graph. This merging operation, unification, is the core of unification grammar.
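The merging operation can be sketched in Python as follows. This is a simplified illustration under stated assumptions: feature structures are plain nested dictionaries, shared (reentrant) substructures are not modeled, and the function name unify is hypothetical.

```python
def unify(fs1, fs2):
    """Unify two feature structures given as (possibly nested) dicts.
    Returns the merged structure, or None if two values clash."""
    result = dict(fs1)
    for feat, val2 in fs2.items():
        if feat not in result:
            # Feature only present in fs2: copy it over.
            result[feat] = val2
            continue
        val1 = result[feat]
        if isinstance(val1, dict) and isinstance(val2, dict):
            # Both values are feature structures: unify them recursively.
            merged = unify(val1, val2)
            if merged is None:
                return None
            result[feat] = merged
        elif val1 != val2:
            # Atomic values that differ cannot be unified.
            return None
    return result

merged = unify({"CAT": "NP", "NUMBER": "SG"}, {"NUMBER": "SG", "PERSON": 3})
clash = unify({"NUMBER": "SG"}, {"NUMBER": "PL"})
```

In this sketch merged combines the information of both inputs into {"CAT": "NP", "NUMBER": "SG", "PERSON": 3}, and clash is None because SG and PL are incompatible.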

Unification grammar is a formalism that can represent type-0 languages, the largest class in the Chomsky hierarchy [63]. According to Tran Ngoc Tuan's group [26], unification grammar can handle some phenomena in Vietnamese, such as the combination of certain words: words can combine only when their feature structures can be unified. For example, the word “book” with the feature SHAPE: square/thin combines only with objects that have the same SHAPE feature description, such as “book”. However, describing most phenomena of Vietnamese grammar in enough detail to build a practical parser is too complicated. The authors of [26] deal only with a subset of Vietnamese nouns.

1.3. Dependency approach

1.3.1. Some concepts

Dependency grammar has its origins in the work of the ancient Indian grammarian Panini; the modern model was introduced by Lucien Tesnière [75]. The study of dependency grammar flourished for Slavic languages [92] and Turkish because of their free word order.

A central notion in the dependency grammar model is an asymmetric relation called a dependency. A dependency holds between a dependent word and another word on which it depends, called the head word.

A dependency grammar uses two alphabets: the terminal symbol set and the auxiliary symbol set.

Each element of the terminal symbol set is a smallest syntactic unit (primitive unit), i.e. a morpheme (in morphologically inflected languages), a syllable, or a word. An utterance is treated as a string of elements of the terminal symbol set.

The auxiliary symbol set is the set of names of the occurrence types (categories) of the terminal symbols. Auxiliary symbols must not be ambiguous; each symbol has fixed syntactic properties.

There are different models of dependency grammars. The first model was formally described by Hays [62] and Gaifman [57].

Definition 1.3. [57]

A dependency grammar is a 4-tuple DG = ( L, C, F, R ), where

L: Terminal alphabet.

C: The auxiliary alphabet.

F: L → C assignment function.

R: The set of dependency rules, each taking one of the following three forms:

  1. Xi(Xj1, Xj2, …, *, …, Xjn), where Xi is the category of the head word, Xj1, Xj2, …, Xjn are the categories of its dependent words, and n is the number of dependents. The order of the symbols in the rule is the order in which the words appear in the sentence (other words may intervene between the words mentioned in the rule). The * marks the position of the head word relative to its dependents in the utterance.
  2. Xi(*), indicating that a terminal of category Xi can appear without any dependents.
  3. *(Xi), indicating that a unit of category Xi can occur without a head. Such a unit is the root of the utterance in which it appears.

For example:

Grammar DG = ( L, C, F, R )

L = { John, loves, a, woman }

C = { N, V, Det }

F: John → N, woman → N, loves → V, a → Det

R includes the rules:

  1. *(V)
  2. V(N, *, N)
  3. N(Det, *)
  4. N(*)
  5. Det(*)

Usually, an artificial ROOT word is added so that headless units such as V can be handled uniformly. The sentence “John loves a woman” can be represented as a tree, as shown in Figure 1.4 below:

Figure 1.4. Analysis of the sentence “John loves a woman” in a dependency grammar model
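The example grammar can be encoded directly to check whether an analysis is licensed by the rules in R. The sketch below is illustrative only: the names ROOT_RULES, RULES and licensed are hypothetical, and the head assignment used at the end (John and woman depend on loves, a depends on woman) is assumed as the standard analysis of this sentence.

```python
# F assigns each terminal a category; rules use "*" for the head position.
F = {"John": "N", "loves": "V", "a": "Det", "woman": "N"}
ROOT_RULES = {"V"}                      # rule *(V)
RULES = {                               # head category -> allowed patterns
    "V": [("N", "*", "N")],             # V(N, *, N)
    "N": [("Det", "*"), ("*",)],        # N(Det, *) and N(*)
    "Det": [("*",)],                    # Det(*)
}

def licensed(words, heads):
    """words: tokens in sentence order; heads[i]: index of the head of
    words[i], or None for the root. Checks rule licensing only; it assumes
    that heads already encodes a tree."""
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1 or F[words[roots[0]]] not in ROOT_RULES:
        return False
    for i, word in enumerate(words):
        # Categories of the dependents of word i in sentence order,
        # with "*" at the position of word i itself.
        pattern = tuple(
            "*" if j == i else F[words[j]]
            for j in range(len(words))
            if j == i or heads[j] == i
        )
        if pattern not in RULES[F[word]]:
            return False
    return True

words = ["John", "loves", "a", "woman"]
good = licensed(words, [1, None, 3, 1])   # a -> woman, John/woman -> loves
wrong = licensed(words, [1, None, 1, 1])  # a attached directly to loves
```

With these inputs good is True, and wrong is False because no rule allows V to take the dependents N, Det, N.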

Several important concepts and properties related to dependency grammars are discussed below.

The definitions below are taken from [75].

Definition 1.4.

A sentence is a sequence of tokens (words) denoted by S = w0w1…wn, where w0 is the artificial root token ROOT.

For simplicity, assume that the tokens w1, …, wn are all distinct; for example, in the sentence “Mary saw John and Fred saw Susan”, the two occurrences of the word “saw” are treated as distinct tokens.

Definition 1.5.

Let R = { r1, …, rm } be a finite set of possible dependency relation types that can hold between two words in a sentence. A relation type r ∈ R is called the label of an arc.

Definition 1.6.

A dependency graph G = (V, A) is a labeled directed graph consisting of a vertex set V and an arc set A such that, for the sentence S = w0w1…wn and the label set R, the following conditions hold:

  • V ⊆ { w0, w1, … wn }.
  • A ⊆ V × R × V.
  • If (wi, r, wj) ∈ A then (wi, r', wj) ∉ A for all r' ≠ r.

Example: The dependency graph of the sentence “Economic news had little effect on financial markets.” is shown in Figure 1.5.

Figure 1.5. Dependency graph of the sentence “Economic news had little effect on financial markets.”

G = (V, A)

V = VS = { ROOT, Economic, news, had, little, effect, on, financial, markets, . }

A = { (ROOT, PRED, had), (had, SBJ, news), (had, OBJ, effect), (had, PU, .), (news, ATT, Economic), (effect, ATT, little), (effect, ATT, on), (on, PC, markets), (markets, ATT, financial) }
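The three conditions of Definition 1.6 can be checked mechanically. The following Python sketch is an illustration under stated assumptions: the function name is_dependency_graph is hypothetical, arcs are triples (head, label, dependent), and the example uses the token markets and the punctuation token "." consistently in both V and A.

```python
def is_dependency_graph(sentence, V, R, A):
    """Check the conditions of Definition 1.6 for a candidate graph (V, A)."""
    V = set(V)
    # Condition 1: V is a subset of the tokens of the sentence.
    if not V <= set(sentence):
        return False
    seen = {}
    for head, label, dep in A:
        # Condition 2: every arc is an element of V x R x V.
        if head not in V or dep not in V or label not in R:
            return False
        # Condition 3: at most one label per ordered pair (head, dep).
        if seen.setdefault((head, dep), label) != label:
            return False
    return True

S = ["ROOT", "Economic", "news", "had", "little", "effect",
     "on", "financial", "markets", "."]
R = {"PRED", "SBJ", "OBJ", "PU", "ATT", "PC"}
A = [("ROOT", "PRED", "had"), ("had", "SBJ", "news"), ("had", "OBJ", "effect"),
     ("had", "PU", "."), ("news", "ATT", "Economic"), ("effect", "ATT", "little"),
     ("effect", "ATT", "on"), ("on", "PC", "markets"),
     ("markets", "ATT", "financial")]

valid = is_dependency_graph(S, S, R, A)
# A second, differently labeled arc between the same pair violates condition 3.
invalid = is_dependency_graph(S, S, R, A + [("had", "OBJ", "news")])
```

The check says nothing about tree shape; it only verifies that (V, A) is a dependency graph in the sense of the definition.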

The definition of a dependency (wi, r, wj) is not unique; it varies across linguistic theories.

Definition 1.7.

A well-formed dependency graph G = (V, A) for a sentence S and a set of dependency relations R is a dependency graph that is a directed tree originating from node w0 and that has the spanning node set V = VS, i.e. V contains every token of S. Such a dependency graph is called a dependency tree.
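A well-formedness check in the sense of Definition 1.7 can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name is_dependency_tree is hypothetical, arcs are triples (head, label, dependent), and the root token is assumed to be the string "ROOT".

```python
def is_dependency_tree(sentence, V, A, root="ROOT"):
    """Check that (V, A) is a directed tree rooted at `root` that spans
    all tokens of the sentence."""
    V = set(V)
    # Spanning node set: V must contain every token of the sentence.
    if V != set(sentence) or root not in V:
        return False
    heads = {}
    for head, _label, dep in A:
        if head not in V or dep not in V:
            return False
        # No arc may enter the root, and no node may have two heads.
        if dep == root or dep in heads:
            return False
        heads[dep] = head
    # Every non-root node needs exactly one head.
    if set(heads) != V - {root}:
        return False
    # Following heads upward from any node must reach the root (no cycles).
    for node in V:
        visited = set()
        while node != root:
            if node in visited:
                return False
            visited.add(node)
            node = heads[node]
    return True

S = ["ROOT", "Economic", "news", "had", "little", "effect",
     "on", "financial", "markets", "."]
A = [("ROOT", "PRED", "had"), ("had", "SBJ", "news"), ("had", "OBJ", "effect"),
     ("had", "PU", "."), ("news", "ATT", "Economic"), ("effect", "ATT", "little"),
     ("effect", "ATT", "on"), ("on", "PC", "markets"),
     ("markets", "ATT", "financial")]

tree = is_dependency_tree(S, S, A)
# Dropping one arc leaves "markets" without a head, so the graph is not a tree.
not_tree = is_dependency_tree(S, S, [a for a in A if a[2] != "markets"])
```

The single-head test together with the upward walk is one simple way to verify tree shape; other formulations (for example, counting arcs and testing connectivity) would serve equally well.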
