Dataset. There is no standard benchmark for our problem: available datasets on microblog annotation (such as the Microposts challenge (Basave et al., 2014)) do not provide global statistics, so trending hashtags cannot be identified from them. We therefore created our own dataset. Using the Twitter API, we collected a sample of 500,551,041 tweets from the public stream between January and April 2014. We removed hashtags that were adopted by fewer than 500 users, that contain no letters, or that have characters repeated more than 4 times (e.g., ‘#oooommgg’). We identified trending hashtags by computing the daily time series of tweet counts for each hashtag and removing those hashtags whose time-series variance score is less than 900. To identify the hashtag burst time period $T$, we compute the outlier fraction (Lehmann et al., 2012) for each hashtag $h$ and day $t$: $p_t(h) = \frac{|n_t - n_b|}{\max(n_b, n_{\min})}$, where $n_t$ is the number of tweets containing $h$ on day $t$, $n_b$ is the median value of $n_t$ over all points in a 2-month time window centered on $t$, and $n_{\min} = 10$ is the threshold used to filter low-activity hashtags. A hashtag is skipped if its highest outlier fraction score is less than 15. Finally, we define the burst time period of a trending hashtag as the time window of size $w$ centered at the day $t_0$ with the highest $p_{t_0}(h)$.
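
The burst detection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 30-day half window (one reading of "2-month window centered on t"), the truncation of the window at the ends of the series, the use of population variance, and the assumption that w is odd are all illustrative choices that the text does not specify.

```python
from statistics import median, pvariance

N_MIN = 10          # n_min from the text
MIN_VARIANCE = 900  # variance threshold from the text
MIN_PEAK = 15       # minimum peak outlier fraction from the text
HALF_WINDOW = 30    # assumption: a 2-month window is about 30 days on each side of t


def outlier_fraction(counts, t, half_window=HALF_WINDOW, n_min=N_MIN):
    """p_t(h) = |n_t - n_b| / max(n_b, n_min), with n_b the windowed median."""
    # Assumption: the window is simply truncated at the ends of the series.
    lo = max(0, t - half_window)
    hi = min(len(counts), t + half_window + 1)
    n_b = median(counts[lo:hi])
    return abs(counts[t] - n_b) / max(n_b, n_min)


def burst_period(counts, w):
    """Return (first_day, last_day) of the burst window, or None if skipped."""
    # Assumption: "variance score" is read as the population variance.
    if pvariance(counts) < MIN_VARIANCE:
        return None  # not trending: daily counts are too flat
    scores = [outlier_fraction(counts, t) for t in range(len(counts))]
    t0 = max(range(len(counts)), key=scores.__getitem__)  # day with the highest p_t(h)
    if scores[t0] < MIN_PEAK:
        return None  # the peak is not a strong enough outlier
    half = w // 2  # assumption: w is odd, so the window spans exactly w days
    return max(0, t0 - half), min(len(counts) - 1, t0 + half)
```

For a hashtag with 20 tweets per day over 120 days and a single day of 2000 tweets at index 60, the windowed median is 20, so the peak outlier fraction is |2000 - 20| / max(20, 10) = 99, and `burst_period(counts, 7)` returns days 57 through 63.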
