
2.2 Syntax

Before considering how grammatical structure can be represented, analyzed and used, we should ask what basis we might have for considering a particular grammar “correct”, or a particular sentence “grammatical,” in the first place. Of course, these are primarily questions for linguistics proper, but the answers we give certainly have consequences for computational linguistics.

Traditionally, formal grammars have been designed to capture linguists' intuitions about well-formedness as concisely as possible, in a way that also allows generalizations about a particular language (e.g., subject-auxiliary inversion in English questions) and across languages (e.g., a consistent ordering of nominal subject, verb, and nominal object for declarative, pragmatically neutral main clauses). Concerning linguists' specific well-formedness judgments, it is worth noting that these are largely in agreement not only with each other, but also with judgments of non-linguists, at least for “clearly grammatical” and “clearly ungrammatical” sentences (Pinker 2007). Also, the discovery that conventional phrase structure supports elegant compositional theories of meaning lends credence to the traditional theoretical methodology.

However, traditional formal grammars have generally not covered any one language comprehensively, and have drawn sharp boundaries between well-formedness and ill-formedness, when in fact people's (including linguists') grammaticality judgments for many sentences are uncertain or equivocal. Moreover, when we seek to process sentences “in the wild”, we would like to accommodate regional, genre-specific, and register-dependent variations in language, dialects, and erroneous and sloppy language (e.g., misspellings, unpunctuated run-on sentences, hesitations and repairs in speech, faulty constituent orderings produced by non-native speakers, and fossilized errors by native speakers, such as “for you and I”, possibly a product of schoolteachers inveighing against “you and me” in subject position). Consequently, linguists' idealized grammars need to be made variation-tolerant in most practical applications.

The way this need has typically been met is by admitting a far greater number of phrase structure rules than linguistic parsimony would sanction (say, 10,000 or more rules instead of a few hundred). These rules are not directly supplied by linguists (computational or otherwise), but rather can be “read off” corpora of written or spoken language that have been decorated by trained annotators (such as linguistics graduate students) with their basic phrasal tree structure. Unsupervised grammar acquisition (often starting with POS-tagged training corpora) is another avenue (see section 9), but results are apt to be less satisfactory. In conjunction with statistical training and parsing techniques, this loosening of grammar leads to a rather different conception of what constitutes a grammatically flawed sentence: it is not necessarily one rejected by the grammar, but one whose analysis requires some rarely used rules.
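The idea of reading rules off annotated trees, and of treating a sentence as flawed to the degree that its analysis needs rare rules, can be illustrated with a minimal Python sketch. The three-tree "treebank", the tuple encoding of trees, and all function names are assumptions made for this illustration; real treebanks are far larger, and practical systems smooth the estimates rather than assigning zero probability to unseen rules.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank: a tree is (label, child, ...); leaves are plain words.
TREEBANK = [
    ("S", ("NP", ("Name", "Thetis")),
          ("VP", ("V", "loves"), ("NP", ("Det", "a"), ("N", "mortal")))),
    ("S", ("NP", ("Det", "a"), ("N", "mortal")),
          ("VP", ("V", "loves"), ("NP", ("Name", "Thetis")))),
    ("S", ("NP", ("Name", "Thetis")), ("VP", ("V", "sleeps"))),
]

def rules(tree):
    """Yield one (lhs, rhs) pair for every local subtree of the tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)

def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_totals = defaultdict(int)
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

def tree_probability(tree, probs):
    """Product of the rule probabilities used in the tree (0.0 if a rule is unseen)."""
    p = 1.0
    for rule in rules(tree):
        p *= probs.get(rule, 0.0)
    return p

probs = estimate(TREEBANK)
```

Under these toy counts the rule VP → V is rarer than VP → V NP, so an analysis that uses it receives a lower score; nothing is rejected outright unless it needs a rule that never occurred.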

As mentioned in section 1.2, the representations of grammars used in computational linguistics have varied from procedural ones to ones developed in formal linguistics, and systematic, tractably parsable variants developed by computationally oriented linguists. Winograd's SHRDLU program, for example, contained code in his PROGRAMMAR language expressing:

To parse a sentence, try parsing a noun phrase (NP); if this fails, return NIL, otherwise try parsing a verb phrase (VP) next and if this fails, or succeeds with words remaining, return NIL, otherwise return success.
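The procedure just quoted can be paraphrased in Python as follows. This is only a sketch of the control flow described in the text, not a reconstruction of Winograd's code: the toy lexicon, the function names, and the particular NP and VP sub-procedures are assumptions, and Python's `None` and `False` stand in for NIL.

```python
# Hypothetical toy lexicon; not taken from Winograd's program.
TOY_LEXICON = {"Thetis": "Name", "a": "Det", "mortal": "N",
               "loves": "V", "sleeps": "V"}

def parse_np(words, i):
    """Return the index just past an NP starting at position i, or None."""
    if i < len(words) and TOY_LEXICON.get(words[i]) == "Name":
        return i + 1
    if (i + 1 < len(words)
            and TOY_LEXICON.get(words[i]) == "Det"
            and TOY_LEXICON.get(words[i + 1]) == "N"):
        return i + 2
    return None

def parse_vp(words, i):
    """Return the index just past a VP (verb plus optional object NP), or None."""
    if i < len(words) and TOY_LEXICON.get(words[i]) == "V":
        j = parse_np(words, i + 1)
        return j if j is not None else i + 1
    return None

def parse_sentence(words):
    """Try an NP, then a VP; fail if either fails or if words remain."""
    i = parse_np(words, 0)
    if i is None:
        return False
    j = parse_vp(words, i)
    if j is None or j != len(words):
        return False
    return True
```

In a grammar written this way, the order of the tests and the handling of failure are fixed by the program text itself, which is what makes the representation procedural rather than declarative.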

Similarly, Woods' grammar for LUNAR was based on a certain kind of procedurally interpreted transition graph (an augmented transition network, or ATN), where the sentence subgraph might contain an edge labeled NP (analyze an NP using the NP subgraph) followed by an edge labeled VP (analogously interpreted). In both cases, local feature values (e.g., the number and person of an NP and VP) are registered, and checked for agreement as a condition for success. A closely related formalism is that of definite clause grammars (e.g., Pereira & Warren 1982), which employ Prolog to assert “facts” such as that if the input word sequence contains an NP reaching from index I1 to index I2 and a VP reaching from index I2 to index I3, then the input contains a sentence reaching from index I1 to index I3. (Again, feature agreement constraints can be incorporated into such assertions as well.) Given the goal of proving the presence of a sentence, the goal-chaining mechanism of Prolog then provides a procedural interpretation of these assertions.
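The index-based reading of such assertions can be sketched in Python. The sketch is not Prolog and not a definite clause grammar implementation: it derives span facts of the form (category, start, end) by simple forward chaining, whereas Prolog works backward from the goal of proving a sentence. The rule list, lexicon, and function names are assumptions made for illustration.

```python
# Hypothetical toy rules and lexicon for the illustration.
SPAN_RULES = [("NP", ("Name",)), ("NP", ("Det", "N")),
              ("VP", ("V", "NP")), ("S", ("NP", "VP"))]
SPAN_LEXICON = {"Thetis": "Name", "a": "Det", "mortal": "N", "loves": "V"}

def derive(lhs, rhs, facts):
    """Return the span facts for lhs that follow from the current facts."""
    if len(rhs) == 1:
        return [(lhs, i, j) for (c, i, j) in facts if c == rhs[0]]
    first, second = rhs
    # An X from I1 to I2 and a Y from I2 to I3 give an lhs from I1 to I3.
    return [(lhs, i1, i3)
            for (c1, i1, i2) in facts if c1 == first
            for (c2, j2, i3) in facts if c2 == second and j2 == i2]

def span_facts(words):
    """Close the lexical span facts under the rules (forward chaining)."""
    facts = {(SPAN_LEXICON[w], i, i + 1)
             for i, w in enumerate(words) if w in SPAN_LEXICON}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in SPAN_RULES:
            for fact in derive(lhs, rhs, facts):
                if fact not in facts:
                    facts.add(fact)
                    changed = True
    return facts

words = "Thetis loves a mortal".split()
facts = span_facts(words)
is_sentence = ("S", 0, len(words)) in facts
```

The input counts as a sentence when a fact for S spans the whole word sequence, here from index 0 to index 4.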

At present, the most commonly employed declarative representations of grammatical structure are context-free grammars (CFGs) as defined by Noam Chomsky (1956, 1957), because of their simplicity and efficient parsability. Chomsky had argued that only deep linguistic representations are context-free, while surface form is generated by transformations (for example, in English passivization and in question formation) that result in a non-context-free language. However, it was later shown that on the one hand, unrestricted Chomskian transformational grammars allowed for computationally intractable and even undecidable languages, and on the other, that the phenomena regarded by Chomsky as calling for a transformational analysis could be handled within a context-free framework by use of suitable features in the specification of syntactic categories. Notably, unbounded movement, such as the apparent movement of the final verb object to the front of the sentence in “Which car did Jack urge you to buy?”, was shown to be analyzable in terms of a gap (or slash) feature of type /NP[wh] that is carried by each of the two embedded VPs, providing a pathway for matching the category of the fronted object to the category of the vacated object position. Within non-transformational grammar frameworks, one therefore speaks of unbounded (or long-distance) dependencies instead of unbounded movement. At the same time it should be noted that at least some natural languages have been shown to be mildly context-sensitive (e.g., Dutch and Swiss German exhibit cross-serial dependencies where a series of nominals “NP1 NP2 NP3 …” need to be matched, in the same order, with a subsequent series of verbs, “V1 V2 V3 …”). Grammatical frameworks that seem to allow for approximately the right degree of mild context sensitivity include Head Grammar, Tree-Adjoining Grammar (TAG), Combinatory Categorial Grammar (CCG), and Linear Indexed Grammar (LIG). Head grammars allow insertion of a complement between the head of a phrase (e.g., the initial verb of a VP, the final noun of an NP, or the VP of a sentence) and an already present complement; they were a historical predecessor of Head-Driven Phrase Structure Grammar (HPSG), a type of unification grammar (see below) that has received much attention in computational linguistics. However, unrestricted HPSG can generate the recursively enumerable (in general only semi-decidable) languages.
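The difference between cross-serial matching and the nested matching that a context-free grammar handles naturally can be made concrete with a small Python sketch. The function names and the placeholder labels NP1, V1, and so on are assumptions for illustration; the sketch only shows which nominal goes with which verb and says nothing about how any of the frameworks above derive the pairing.

```python
def cross_serial_pairs(nominals, verbs):
    """Pair the i-th nominal with the i-th verb: NP1-V1, NP2-V2, ..."""
    if len(nominals) != len(verbs):
        raise ValueError("each nominal needs exactly one verb")
    return list(zip(nominals, verbs))

def nested_pairs(nominals, verbs):
    """Pair the i-th nominal with the i-th verb counted from the end."""
    if len(nominals) != len(verbs):
        raise ValueError("each nominal needs exactly one verb")
    return list(zip(nominals, reversed(verbs)))

nominals = ["NP1", "NP2", "NP3"]
verbs = ["V1", "V2", "V3"]
crossing = cross_serial_pairs(nominals, verbs)
nesting = nested_pairs(nominals, verbs)
```

Nested pairing matches the last nominal with the first verb, which is the last-in, first-out order of a stack; cross-serial pairing preserves the order of the nominals, which is the pattern that exceeds context-free power when the series can be arbitrarily long.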

A typical (somewhat simplified) sample fragment of a context-free grammar is the following, where phrase types are annotated with feature-value pairs:

S[vform:v] → NP[pers:p numb:n case:subj] VP[vform:v pers:p numb:n]
VP[vform:v pers:p numb:n] → V[subcat:_np vform:v pers:p numb:n] NP[case:obj]
NP[pers:3 numb:n] → Det[pers:3 numb:n] N[numb:n]
NP[numb:n pers:3 case:c] → Name[numb:n pers:3 case:c]

Here v, n, p, c are variables that can assume values such as ‘past’, ‘pres’, ‘base’, ‘pastparticiple’, … (i.e., various verb forms), ‘1’, ‘2’, ‘3’ (1st, 2nd, and 3rd person), ‘sing’, ‘plur’, and ‘subj’, ‘obj’. The subcat feature indicates the complement requirements of the verb. The lexicon would supply entries such as

V[subcat:_np vform:pres numb:sing pers:3] → loves
Det[pers:3 numb:sing] → a
N[pers:3 numb:sing] → mortal
Name[pers:3 numb:sing gend:fem case:subj] → Thetis,

allowing, for example, a phrase structure analysis of the sentence “Thetis loves a mortal” (where we have omitted the feature names for simplicity, leaving only their values, and ignored the case feature
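How the feature variables in the grammar fragment enforce agreement can be sketched with a small Python recognizer. This is an illustration under stated assumptions rather than a standard implementation: variables are written with a leading "?", a feature missing from a constituent is treated as unspecified and therefore compatible with any value, the parsing strategy is naive top-down search, and the entry for "mortals" is a hypothetical addition used only to show an agreement failure.

```python
def is_var(value):
    """Variables are written '?v', '?p', ... (a notation assumed here)."""
    return value.startswith("?")

# Rules copied from the fragment: ((lhs, features), [(category, features), ...]).
RULES = [
    (("S", {"vform": "?v"}),
     [("NP", {"pers": "?p", "numb": "?n", "case": "subj"}),
      ("VP", {"vform": "?v", "pers": "?p", "numb": "?n"})]),
    (("VP", {"vform": "?v", "pers": "?p", "numb": "?n"}),
     [("V", {"subcat": "_np", "vform": "?v", "pers": "?p", "numb": "?n"}),
      ("NP", {"case": "obj"})]),
    (("NP", {"pers": "3", "numb": "?n"}),
     [("Det", {"pers": "3", "numb": "?n"}), ("N", {"numb": "?n"})]),
    (("NP", {"numb": "?n", "pers": "3", "case": "?c"}),
     [("Name", {"numb": "?n", "pers": "3", "case": "?c"})]),
]

LEXICON = {
    "loves": [("V", {"subcat": "_np", "vform": "pres", "numb": "sing", "pers": "3"})],
    "a": [("Det", {"pers": "3", "numb": "sing"})],
    "mortal": [("N", {"pers": "3", "numb": "sing"})],
    "Thetis": [("Name", {"pers": "3", "numb": "sing", "gend": "fem", "case": "subj"})],
    # Hypothetical extra entry, not in the text, to show an agreement failure.
    "mortals": [("N", {"pers": "3", "numb": "plur"})],
}

def unify(pattern, feats, env):
    """Match a rule pattern against a constituent; return new bindings or None."""
    env = dict(env)
    for name, value in pattern.items():
        if name not in feats:          # unspecified feature: anything goes
            continue
        if is_var(value):
            if value in env and env[value] != feats[name]:
                return None            # variable already bound to another value
            env[value] = feats[name]
        elif value != feats[name]:
            return None
    return env

def instantiate(pattern, env):
    """Fill in bound variables; unbound ones are left out (unspecified)."""
    return {k: (env[v] if is_var(v) else v)
            for k, v in pattern.items() if not is_var(v) or v in env}

def parse(cat, words, i):
    """Yield (features, end) for each constituent of category cat starting at i."""
    if i < len(words):
        for c, feats in LEXICON.get(words[i], []):
            if c == cat:
                yield feats, i + 1
    for (lhs, lhs_pat), rhs in RULES:
        if lhs == cat:
            yield from expand(lhs_pat, rhs, words, i, {})

def expand(lhs_pat, rhs, words, i, env):
    """Parse the right-hand side left to right, threading variable bindings."""
    if not rhs:
        yield instantiate(lhs_pat, env), i
        return
    (cat, pat), rest = rhs[0], rhs[1:]
    for feats, j in parse(cat, words, i):
        env2 = unify(pat, feats, env)
        if env2 is not None:
            yield from expand(lhs_pat, rest, words, j, env2)

def recognize(words):
    return any(end == len(words) for _, end in parse("S", words, 0))
```

With these entries, `recognize("Thetis loves a mortal".split())` succeeds, while "Thetis loves a mortals" fails on number agreement between Det and N. Because the only entry given for Thetis is marked case:subj, this sketch also rejects "a mortal loves Thetis"; a fuller lexicon would presumably supply an object-case entry as well.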



