3.7.1 Vector Space Model
Many traceability recovery techniques use VSM as the base algorithm [3], [28], [29]. In VSM, documents are represented as vectors in the space of all the terms. Different term weighting schemes can be used to construct these vectors. We use the standard TF-IDF weighting scheme [28]: a document is a vector of TF-IDF weights. TF is often called the local weight. The most frequent terms have more weight in TF, but this by itself does not mean that they are important terms. The inverse document frequency, IDF, of a term measures the global weight of the term and is computed as $\mathrm{IDF} = \log_2\left(\frac{|D|}{|d : t_i \in d|}\right)$. Then, TF-IDF is defined as
$$(\mathrm{TF\text{-}IDF})_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \times \log_2\left(\frac{|D|}{|d : t_i \in d|}\right),$$
where $n_{i,j}$ is the number of occurrences of a term $t_i$ in document $d_j$, $\sum_k n_{k,j}$ is the sum of the occurrences of all the terms in document $d_j$, $|D|$ is the total number of documents $d$ in the corpus, and $|d : t_i \in d|$ is the number of documents in which the term $t_i$ appears.
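The weighting defined above can be illustrated with a minimal Python sketch. It assumes documents are already tokenized into lists of terms; the function name `tf_idf_vectors` and the toy corpus are hypothetical and are not taken from the cited work.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each document name to a sparse {term: TF-IDF weight} vector.

    `docs` is a dict of document name -> list of tokens (an assumed input format).
    """
    n_docs = len(docs)

    # Document frequency: the number of documents in which each term appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)       # n_{i,j} for every term t_i in d_j
        total = sum(counts.values())   # sum over k of n_{k,j}
        vectors[name] = {
            term: (n / total) * math.log2(n_docs / df[term])
            for term, n in counts.items()
        }
    return vectors


# Hypothetical toy corpus: one requirement and two classes.
corpus = {
    "req1": ["user", "login", "password"],
    "ClassA": ["login", "password", "hash", "password"],
    "ClassB": ["report", "print"],
}
vectors = tf_idf_vectors(corpus)
```

Under this definition, a term that appears in every document gets an IDF of log2(1) = 0, so it contributes nothing to any vector, however frequent it is locally.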
Once documents are represented as vectors of terms in a VSM, a traceability link is created between every pair of documents, e.g., a requirement and a class, and each link carries the similarity value of its pair. The similarity between two documents is measured by the positive cosine of the angle between their corresponding vectors (because the similarity between two documents cannot be negative). The recovered links are ranked by similarity, and a similarity threshold is used to select the set of candidate links to be manually verified [3].
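The ranking and thresholding step might look like the following sketch. It assumes sparse vectors stored as `{term: weight}` dictionaries; the names `cosine` and `candidate_links`, the example weights, and the threshold of 0.1 are illustrative assumptions, not values from the cited work.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse vectors, clamped at zero."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return max(0.0, dot / (norm_u * norm_v))


def candidate_links(sources, targets, threshold):
    """Rank all source-target pairs by similarity and keep those at or above the threshold."""
    links = [
        (s_name, t_name, cosine(s_vec, t_vec))
        for s_name, s_vec in sources.items()
        for t_name, t_vec in targets.items()
    ]
    links.sort(key=lambda link: link[2], reverse=True)
    return [link for link in links if link[2] >= threshold]


# Hypothetical weights for one requirement and two classes.
requirements = {"req1": {"user": 0.5, "login": 0.2, "password": 0.2}}
classes = {
    "ClassA": {"login": 0.1, "password": 0.3, "hash": 0.4},
    "ClassB": {"report": 0.8, "print": 0.8},
}
links = candidate_links(requirements, classes, threshold=0.1)
```

Here `req1` shares terms only with `ClassA`, so that pair is the only one that survives the threshold. The pair with `ClassB` has similarity 0 and is dropped before manual verification.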
