5.4.4 Repurposing and Reinterpretation
One of the foundational concepts for the use of data for analytics is the possibility of finding interesting patterns that can lead to actionable insight, and you must keep in mind that any acquired dataset may be used for any potential purpose at any time in the future. However, this strategy of data reuse can also backfire. Repeated copying and repurposing leads to a greater degree of separation between data producer and data consumer. With each successive reuse, the data consumers must again reinterpret what the data means. Eventually, any inherent semantics associated with the data at the time it was created evaporate.
Governance will also mean establishing some limits around the scheme for repurposing. New policies may be necessary when it comes to determining what data to acquire and what to ignore, which concepts to capture and which ones should be trashed, the volume of data to be retained and for how long, and other qualitative data management and custodianship policies.

5.4.5 Data Enrichment and Enhancement
It is hard to consider any need for data governance or quality for large acquired datasets without discussing alternatives for data cleansing and correction. The plain truth is that in general you will have no control over the quality and validity of data that is acquired from outside the organization. Validation rules can be used to score the usability of the data based on end-user requirements, but if those scores are below the level of acceptability and you still want to do the analysis, you basically have these choices:
1. Don’t use the data at all.
2. Use the data in its “unacceptable” state and modulate your users’ expectations in relation to the validity score.
3. Change the data to a more acceptable form.
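The decision among these three options can be made mechanically once validation rules produce a usability score. The following is a minimal sketch, not a prescribed implementation; the rules, thresholds, and record fields are all hypothetical:

```python
# Hypothetical sketch: score a dataset's usability against simple
# validation rules, then pick one of the three options above.

def validity_score(records, rules):
    """Return the fraction of records that pass every validation rule."""
    if not records:
        return 0.0
    passed = sum(1 for r in records if all(rule(r) for rule in rules))
    return passed / len(records)

# Illustrative rules for a toy customer record.
rules = [
    lambda r: len(r.get("zip", "")) == 5,     # well-formed ZIP
    lambda r: "@" in r.get("email", ""),      # plausible email
]

records = [
    {"zip": "10001", "email": "a@example.com"},
    {"zip": "1000",  "email": "b@example.com"},   # bad ZIP
    {"zip": "94105", "email": "not-an-email"},    # bad email
]

score = validity_score(records, rules)
ACCEPTABLE = 0.9   # acceptability threshold set by end-user requirements

if score >= ACCEPTABLE:
    action = "use as-is"                                   # option 2 (or no issue)
elif score >= 0.5:
    action = "use, but publish the validity score to users"  # option 2
else:
    action = "cleanse or reject"                           # option 3 or 1
```

In practice the thresholds would be negotiated with the data consumers, since "acceptable" depends entirely on the analysis being performed.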
The third choice might not be as drastic as you think. If the business application requires accuracy and precision in the data, attempting to use unacceptable data will introduce a risk that the results may not be trustworthy. On the other hand, if you are analyzing extremely large datasets for curious and interesting patterns or to identify relationships among many different entities, there is some leeway for executing the process in the presence of a small number of errors. A minimal percentage of data flaws will not significantly skew the results.
As an example, large online retailers want to drive increased sales through relationship analysis, as well as look at sales correlations within “market baskets” (the collection of items purchased by an individual at one time). When processing millions of (or orders of magnitude more) transactions a day, a minimal number of inconsistencies, incomplete records, or errors is likely to be irrelevant. However, should incorrect values be an impediment to the analysis, and if making changes does not significantly alter the data from its original form other than in a positive and expected way, data enhancement and enrichment may be a reasonable alternative. A good example is address standardization. Address locations may be incomplete or even incorrect (e.g., the zip code may be wrong). Standardizing an address’s format and applying corrections can only improve the data.
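A toy sketch of that kind of correction is shown below. The abbreviation table and the city-to-ZIP lookup are invented for illustration; a production system would rely on certified postal reference data rather than a hand-built dictionary:

```python
# Hypothetical address standardization: normalize the street format and
# correct the ZIP code against illustrative reference data.

CITY_TO_ZIP = {("SPRINGFIELD", "IL"): "62701"}   # invented reference table

ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}

def standardize(address):
    """Expect 'street, city, state, zip'; return a normalized string."""
    parts = address.upper().split(",")
    street = parts[0].strip()
    for long_form, short in ABBREVIATIONS.items():
        street = street.replace(long_form, short)
    city, state, zip_code = (p.strip() for p in parts[1:4])
    # Replace the ZIP when the reference data disagrees with the input.
    zip_code = CITY_TO_ZIP.get((city, state), zip_code)
    return f"{street}, {city}, {state} {zip_code}"

print(standardize("123 Main Street, Springfield, IL, 60601"))
# The inconsistent ZIP 60601 is replaced by the reference value 62701.
```

The key property is that the transformation is deterministic and traceable: every change moves the record toward a single canonical form, which is what makes this kind of enrichment “positive and expected.”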
The same could be said for linking extracted entities to known identity profiles using algorithms that match identities with high probability. Making that link enhances the analysis through the sharing of profile information for extracted entities. A similar process can be used in connection with our defined reference metadata hierarchies and taxonomies: standardizing references to items or concepts in relation to a taxonomic order lets your application treat cars, automobiles, vans, minivans, SUVs, trucks, and RVs as vehicles, at least for certain analytical purposes.
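The taxonomy-driven standardization described above can be sketched as a simple term-to-concept rollup. The hierarchy here is invented reference metadata for illustration; a real deployment would draw on the organization's own taxonomies:

```python
# Hypothetical sketch: roll specific item references up a taxonomy so
# the analysis can treat cars, vans, SUVs, etc. as one "vehicle" concept.

TAXONOMY = {
    "car": "vehicle", "automobile": "vehicle", "van": "vehicle",
    "minivan": "vehicle", "suv": "vehicle", "truck": "vehicle",
    "rv": "vehicle",
    "apple": "produce", "banana": "produce",
}

def to_concept(term):
    """Map a term to its parent concept; unknown terms pass through."""
    return TAXONOMY.get(term.lower(), term.lower())

basket = ["SUV", "minivan", "banana"]
concepts = [to_concept(item) for item in basket]
# concepts == ["vehicle", "vehicle", "produce"]
```

Aggregating at the concept level rather than the raw term level is what lets a market-basket analysis notice that a customer bought “a vehicle accessory” regardless of whether the record said truck, van, or RV.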