Sampling. From the trending hashtags, we sample 30 distinct hashtags for evaluation. Since our study focuses on trending hashtags that are mappable to entities in Wikipedia, the sample must cover a sufficient number of “popular” topics that are seen in Wikipedia and, at the same time, cover rare topics in the long tail. To this end, we apply several heuristics in the sampling. First, we only consider hashtags for which the lexicon-based linking (Section 3.1) yields at least 20 different entities. Second, we randomly choose hashtags to cover different types of topics (long-running events, breaking events, endogenous hashtags). Instead of inspecting all hashtags in our corpus, we follow Lehmann et al. (2012) and calculate, for each hashtag, the fraction of tweets published before, during and after the peak. The hashtags are then clustered in this 3-dimensional vector space. Each cluster suggests a group of hashtags with distinct semantics (Lehmann et al., 2012). We then pick hashtags randomly from each cluster, resulting in 200 hashtags in total. From this rough sample, three inspectors carefully checked the tweets and chose 30 hashtags whose meanings and hashtag types they could determine with certainty.
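The automatic part of this procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the text does not state the time granularity, the clustering algorithm, or the number of clusters, so the sketch assumes daily bins, a plain k-means (swapped in purely for illustration), and hypothetical names such as `peak_fractions`, `filter_by_entities`, and `sample_per_cluster`. Only the threshold of 20 entities and the three-dimensional (before, during, after) representation come from the text.

```python
import random
from collections import Counter


def filter_by_entities(candidates, min_entities=20):
    """Keep hashtags whose lexicon-based linking yields enough distinct entities.

    `candidates` is a hypothetical mapping: hashtag -> collection of entity ids.
    """
    return [tag for tag, ents in candidates.items() if len(set(ents)) >= min_entities]


def peak_fractions(day_indices):
    """Return the fractions of tweets (before, during, after) the peak day.

    Assumes one integer day index per tweet; ties go to the earliest day.
    """
    counts = Counter(day_indices)
    peak = max(counts, key=lambda d: (counts[d], -d))
    total = len(day_indices)
    before = sum(c for d, c in counts.items() if d < peak) / total
    during = counts[peak] / total
    after = sum(c for d, c in counts.items() if d > peak) / total
    return (before, during, after)


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of tuples (an assumption, see lead-in). Returns labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(pt, centers[j])))
            for pt in points
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def sample_per_cluster(hashtags, labels, per_cluster, seed=0):
    """Draw up to `per_cluster` hashtags uniformly at random from every cluster."""
    rng = random.Random(seed)
    clusters = {}
    for tag, lab in zip(hashtags, labels):
        clusters.setdefault(lab, []).append(tag)
    picked = []
    for lab in sorted(clusters):
        members = clusters[lab]
        picked.extend(rng.sample(members, min(per_cluster, len(members))))
    return picked


if __name__ == "__main__":
    # Toy data: day index of each tweet, per hashtag (entirely made up).
    tweets = {
        "#a": [0, 1, 2, 3, 3, 3],      # build-up before the peak
        "#b": [0, 1, 2, 2, 3, 3, 3],   # build-up before the peak
        "#c": [0, 0, 0, 1, 2, 3],      # peak first, then decay
        "#d": [0, 0, 0, 0, 1, 2, 3],   # peak first, then decay
    }
    tags = list(tweets)
    vectors = [peak_fractions(tweets[t]) for t in tags]
    labels = kmeans(vectors, k=2)
    print(sample_per_cluster(tags, labels, per_cluster=1))
```

In the setting described above, the per-cluster quota would be chosen so that the draws add up to the 200-hashtag rough sample; the final reduction to 30 hashtags is a manual step and is not modeled here.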
