case 2 of Hunt’s method, a test based on a single attribute is chosen for expanding the current node. The choice of an attribute is normally based on the entropy gains [Qui93] of the attributes. The entropy of an attribute, calculated from class distribution information, reflects the classification power of the attribute by itself. The best attribute is selected as the test for the node expansion.

Highly parallel algorithms for constructing classification decision trees are desirable for handling large data sets in a reasonable amount of time. Classification decision tree construction algorithms have natural concurrency: once a node is generated, all of its children in the classification tree can be generated concurrently. Furthermore, the computation for generating the successors of a classification tree node can itself be decomposed by performing data decomposition on the training data. Nevertheless, parallelizing classification tree construction is challenging for the following reasons. First, the shape of the tree is highly irregular and is determined only at runtime. Moreover, the amount of work associated with each node varies and is data dependent. Hence any static allocation scheme is likely to suffer from major load imbalance. Second, even though the successors of a node can be processed concurrently, they all use the training data associated with the parent node. If this data is dynamically partitioned and allocated to the different processors that perform the computation for different nodes, then there is a high cost for data movement. If the data is not partitioned appropriately, performance can suffer due to loss of locality.
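The entropy-based attribute selection described above can be sketched as follows. This is a minimal illustration, not any of the cited implementations; the training records and labels are hypothetical, and only discrete attributes are handled, matching the scope of this section.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(records, attr_index, labels):
    """Entropy reduction obtained by splitting on the discrete
    attribute at position attr_index."""
    n = len(labels)
    # Partition the class labels by the attribute's discrete values.
    partitions = {}
    for rec, lab in zip(records, labels):
        partitions.setdefault(rec[attr_index], []).append(lab)
    # Weighted entropy of the induced partitions.
    split_entropy = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - split_entropy

# Hypothetical training data: attributes (outlook, windy), class label "play".
records = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
labels = ["no", "no", "yes", "yes"]

# The node is expanded with a test on the attribute of highest gain.
gains = [information_gain(records, i, labels) for i in range(2)]
best = max(range(2), key=lambda i: gains[i])
```

In this toy data set, `outlook` perfectly separates the classes while `windy` carries no information, so the first attribute is chosen as the test for the node.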
Several parallel formulations of classification decision tree construction have been proposed recently [Pea94,GAR96,SAM96,CDG+97,Kuf97,JKK98,SHKS99]. In this section, we present two basic parallel formulations for classification decision tree construction and a hybrid scheme, described in [SHKS99], that combines the good features of both approaches. Most of the other parallel algorithms are similar in nature to these two basic algorithms, and their characteristics can be explained in terms of them. For these parallel formulations, we focus our presentation on discrete attributes only; the handling of continuous attributes is discussed separately. In all parallel formulations, we assume that N