Most of the recent corpus-based POS taggers in
the literature are either statistically based, and
use Markov Model (Weischedel et al., 1993;
Merialdo, 1994) or Statistical Decision
Tree (SDT) (Jelinek et al., 1994; Magerman, 1995)
techniques, or are primarily rule based,
such as Brill's Transformation Based
Learner (TBL) (Brill, 1994). The Maximum
Entropy (MaxEnt) tagger presented in this paper
combines the advantages of all these methods. It
uses a rich feature representation, like TBL and
SDT, and generates a tag probability distribution
for each word, like Decision Tree and Markov
Model techniques.
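
To make the phrase "tag probability distribution for each word" concrete, the following is a minimal sketch of a log-linear (maximum entropy style) distribution over tags. It is not the tagger described in this paper: the tag set, the two feature templates, the weight values, and the names `WEIGHTS`, `TAGS`, `extract_features` and `tag_distribution` are all assumptions invented for illustration.

```python
import math

# Hypothetical weights for (feature, tag) pairs; the values are invented
# for illustration and are not estimated from any corpus.
WEIGHTS = {
    ("suffix=ing", "VBG"): 2.0,
    ("suffix=ing", "NN"): 0.5,
    ("prev_tag=DT", "NN"): 1.5,
    ("prev_tag=DT", "VBG"): -0.5,
    ("prev_tag=DT", "JJ"): 1.0,
}

# A deliberately tiny tag set (assumption).
TAGS = ["NN", "VBG", "JJ"]


def extract_features(word, prev_tag):
    """Return the binary features that are active in this context."""
    return [f"suffix={word[-3:]}", f"prev_tag={prev_tag}"]


def tag_distribution(word, prev_tag):
    """Return p(tag | context) as a normalized exponential of summed weights."""
    active = extract_features(word, prev_tag)
    scores = {
        tag: math.exp(sum(WEIGHTS.get((feat, tag), 0.0) for feat in active))
        for tag in TAGS
    }
    total = sum(scores.values())
    return {tag: score / total for tag, score in scores.items()}


dist = tag_distribution("meeting", "DT")
for tag, prob in sorted(dist.items(), key=lambda item: -item[1]):
    print(f"{tag}\t{prob:.3f}")
```

Every tag receives a probability rather than a single hard decision, and any property of the context can be added as a further feature without changing the form of the model.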
5 The mapping from article to annotator is in the
file doc/wsj.wht on the Treebank CDROM.
6 The single-annotator training data was obtained
by extracting those articles tagged by "maryann" in
the Treebank v.5 CDROM. This training data does
not overlap with the Development and Test sets used
in the paper. The single-annotator Development Set
is the portion of the Development Set which has also
been annotated by "maryann". The word vocabulary
and tag dictionary are the same as in the baseline
experiment.

