Equation (6.4) shows that the confidence of rule A ⇒ B can be easily derived from the support counts of A and A ∪ B. That is, once the support counts of A, B, and A ∪ B are found, it is straightforward to derive the corresponding association rules A ⇒ B and B ⇒ A and check whether they are strong. Thus, the problem of mining association rules can be reduced to that of mining frequent itemsets.
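As a minimal sketch of this derivation, the following Python computes the confidence of a rule from two support counts. It assumes Equation (6.4) has the usual form confidence(A ⇒ B) = support_count(A ∪ B) / support_count(A). The function names and the toy transactions are hypothetical and chosen only for illustration.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """confidence(A => B) = support_count(A ∪ B) / support_count(A)."""
    a = frozenset(antecedent)
    ab = a | frozenset(consequent)
    count_a = support_count(transactions, a)
    if count_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return support_count(transactions, ab) / count_a


# Hypothetical toy data, not taken from the text.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]

# The same pair of items yields two rules with different confidences,
# because the denominators (the antecedent counts) differ.
conf_bread_milk = confidence(transactions, {"bread"}, {"milk"})  # 2 / 4
conf_milk_bread = confidence(transactions, {"milk"}, {"bread"})  # 2 / 3
```

Both rules share the numerator support_count({bread, milk}), so only the three counts for A, B, and A ∪ B are needed to evaluate A ⇒ B and B ⇒ A.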
In general, association rule mining can be viewed as a two-step process:

1. Find all frequent itemsets: By definition, each of these itemsets will occur at least as frequently as a predetermined minimum support count, min_sup.
2. Generate strong association rules from the frequent itemsets: By definition, these rules must satisfy minimum support and minimum confidence.
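The two steps can be sketched in Python as follows. This is a deliberately naive, brute-force illustration that enumerates every candidate itemset. It is not an efficient mining algorithm, and the function names, thresholds, and toy transactions are assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_sup):
    """Step 1 (brute force): map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            c = frozenset(candidate)
            count = sum(1 for t in transactions if c <= t)
            if count >= min_sup:
                frequent[c] = count
    return frequent


def strong_rules(frequent, min_conf):
    """Step 2: split each frequent itemset into A => B and keep confident rules."""
    rules = []
    for itemset, count in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so the
                # antecedent's count is already available from step 1.
                conf = count / frequent[a]
                if conf >= min_conf:
                    rules.append((a, itemset - a, conf))
    return rules


# Hypothetical toy data and thresholds, not taken from the text.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]
frequent = frequent_itemsets(transactions, min_sup=2)
rules = strong_rules(frequent, min_conf=0.6)
```

Step 2 never rescans the transactions: it only looks up counts produced by step 1, which is why the cost of step 1 dominates.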

Additional interestingness measures can be applied for the discovery of correlation relationships between associated items, as will be discussed in Section 6.3. Because the second step is much less costly than the first, the overall performance of mining association rules is determined by the first step.
A major challenge in mining frequent itemsets from a large data set is the fact that such mining often generates a huge number of itemsets satisfying the minimum support (min_sup) threshold, especially when min_sup is set low. This is because if an itemset is frequent, each of its subsets is frequent as well. A long itemset will contain a combinatorial number of shorter, frequent sub-itemsets. For example, a frequent itemset of length 100, such as {a1, a2, ..., a100}, contains $\binom{100}{1} = 100$ frequent 1-itemsets: {a1}, {a2}, ..., {a100}; $\binom{100}{2}$ frequent 2-itemsets: {a1, a2}, {a1, a3}, ..., {a99, a100}; and so on. The total number of frequent itemsets that it contains is thus

$$\binom{100}{1} + \binom{100}{2} + \cdots + \binom{100}{100} = 2^{100} - 1 \approx 1.27 \times 10^{30}. \tag{6.5}$$
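The count in Equation (6.5) can be checked with exact integer arithmetic. The short Python sketch below sums the binomial coefficients directly and compares the result with the closed form; the variable names are illustrative only.

```python
from math import comb

n = 100  # length of the frequent itemset {a1, ..., a100}

# Number of non-empty sub-itemsets: C(100, 1) + C(100, 2) + ... + C(100, 100).
total = sum(comb(n, k) for k in range(1, n + 1))

# The binomial theorem gives the closed form 2^100 - 1 (the empty set is excluded).
closed_form = 2**n - 1

print(total == closed_form)
print(f"{total:.2e}")
```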

Such a number of itemsets is far too large for any computer to compute or store. To overcome this difficulty, we introduce the concepts of closed frequent itemset and maximal frequent itemset.
An itemset X is closed in a data set D if there exists no proper super-itemset Y⁵ such that Y has the same support count as X in D. An itemset X is a closed frequent itemset in set D if X is both closed and frequent in D. An itemset X is a maximal frequent itemset (or max-itemset) in a data set D if X is frequent, and there exists no super-itemset Y such that X ⊂ Y and Y is frequent in D.
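To make the two definitions concrete, the following Python sketch applies them literally to a very small data set. It is a brute-force illustration only: the function names, the two toy transactions, and min_sup = 1 are assumptions for the example, and the enumeration is feasible only because there are four items.

```python
from itertools import combinations


def support_counts(transactions):
    """Support count of every itemset that occurs in at least one transaction."""
    items = sorted(set().union(*transactions))
    counts = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            c = frozenset(candidate)
            n = sum(1 for t in transactions if c <= t)
            if n > 0:
                counts[c] = n
    return counts


def closed_and_maximal(transactions, min_sup):
    """Return (frequent, closed frequent, maximal frequent); assumes min_sup >= 1."""
    counts = support_counts(transactions)
    frequent = {x: n for x, n in counts.items() if n >= min_sup}
    closed, maximal = set(), set()
    for x, n in frequent.items():
        supersets = [y for y in counts if x < y]  # proper super-itemsets of x
        # Closed: no proper super-itemset has the same support count.
        if all(counts[y] != n for y in supersets):
            closed.add(x)
        # Maximal: no proper super-itemset is frequent.
        if all(counts[y] < min_sup for y in supersets):
            maximal.add(x)
    return frequent, closed, maximal


# Hypothetical toy data: two transactions over the items a, b, c, d.
transactions = [frozenset("abcd"), frozenset("ab")]
frequent, closed, maximal = closed_and_maximal(transactions, min_sup=1)
```

With these two transactions, all 15 non-empty subsets of {a, b, c, d} are frequent, yet only {a, b} (count 2) and {a, b, c, d} (count 1) are closed, and only {a, b, c, d} is maximal.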
Let C be the set of closed frequent itemsets for a data set D satisfying a minimum support threshold, min_sup. Let M be the set of maximal frequent itemsets for D satisfying min_sup. Suppose that we have the support count of each itemset in C and M. Notice that C and its count information can be used to derive the whole set of frequent itemsets.
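One way to see why C suffices is the following property: every frequent itemset is a subset of some closed frequent itemset, and its support count equals the largest count among the closed frequent itemsets that contain it. The Python sketch below rebuilds all frequent itemsets from C on that basis. The function name and the contents of `closed_counts` are hypothetical example data.

```python
from itertools import combinations


def all_frequent_from_closed(closed_counts):
    """Rebuild every frequent itemset and its support count from C alone."""
    frequent = {}
    for c, n in closed_counts.items():
        for k in range(1, len(c) + 1):
            for sub in combinations(sorted(c), k):
                s = frozenset(sub)
                # A subset inherits the largest count among the closed
                # itemsets that contain it.
                frequent[s] = max(frequent.get(s, 0), n)
    return frequent


# Hypothetical C with its counts: {a, b} occurs twice, {a, b, c, d} once.
closed_counts = {frozenset("ab"): 2, frozenset("abcd"): 1}
frequent = all_frequent_from_closed(closed_counts)
```

Here two closed itemsets and their counts are enough to recover all 15 frequent itemsets, for example {a} with count 2 and {c, d} with count 1.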

⁵ Y is a proper super-itemset of X if X is a proper sub-itemset of Y, that is, if X ⊂ Y. In other words, every item of X is contained in Y but there is at least one item of Y that is not in X.

