PROBABILISTIC LANGUAGE PROCESSING
In which we see how simple, statistically trained language models can be used to process collections of millions of words, rather than just single sentences.
In Chapter 22, we saw how an agent could communicate with another agent (human or software) using utterances in a common language. Complete syntactic and semantic analysis is necessary to extract the full meaning of those utterances, and it is possible because they are short and restricted to a limited domain.
In this chapter, we consider the corpus-based approach to language understanding. A corpus (plural corpora) is a large collection of text, such as the billions of pages that make up the World Wide Web. The text is written by and for humans, and the task of the software is to make it easier for the human to find the right information. This approach implies the use of statistics and learning to take advantage of the corpus, and it usually entails probabilistic language models that can be learned from data and that are simpler than the augmented DCGs of Chapter 22. For most tasks, the volume of data more than makes up for the simpler language model. We will look at three specific tasks: information retrieval (Section 23.2), information extraction (Section 23.3), and machine translation (Section 23.4). But first we present an overview of probabilistic language models.
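To make the idea of a language model "learned from data" concrete, here is a minimal sketch of one simple kind of probabilistic model: a bigram model estimated by relative frequency. The choice of a bigram model, the function names, the sentence-boundary markers, and the three-sentence toy corpus are all assumptions made for illustration; they are not taken from the chapter, and a real corpus would be many orders of magnitude larger.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def tokenize(sentence):
    """Lowercase, split on whitespace, and add boundary markers."""
    return [START] + sentence.lower().split() + [END]


def train_bigram_model(sentences):
    """Estimate P(word | previous word) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = tokenize(sentence)
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    model = {}
    for prev, following in counts.items():
        total = sum(following.values())
        model[prev] = {word: n / total for word, n in following.items()}
    return model


def sentence_probability(model, sentence):
    """Multiply bigram probabilities; unseen bigrams get probability 0."""
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        prob *= model.get(prev, {}).get(curr, 0.0)
    return prob


# Toy corpus, invented for this example.
toy_corpus = [
    "the agent reads the page",
    "the agent finds the page",
    "the human reads",
]
model = train_bigram_model(toy_corpus)
```

Calling `sentence_probability(model, "the human reads")` scores a sentence that appears in the toy corpus, while `sentence_probability(model, "the human finds")` returns 0 because the pair "human finds" was never observed. This sketch uses no smoothing, so any unseen word pair zeroes out the whole sentence; with so little data that is a serious limitation, which is one way to see why the volume of the corpus matters so much for simple models.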

