Data Quality

Data quality refers to the ability to use a dataset for its intended purpose. It depends on
four criteria:

availability,
relevance,
cleanliness,
and usability.

Data availability means that the data is ready for use and up to date.

Relevance implies that data should be clear, not confusing, and should answer the research questions being asked. Irrelevant data is of no use to a data analyst.

Clean, complete datasets are free of errors and have minimal missing entries.
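As a rough illustration of checking for missing entries, the sketch below computes the share of missing values per column. The sample records, the column names, and the helper `missing_share` are hypothetical, and treating `None` and the empty string as "missing" is an assumption that will not fit every dataset.

```python
# Hypothetical sample records; None or "" is assumed to mark a missing entry.
records = [
    {"id": 1, "age": 34, "city": "Lyon"},
    {"id": 2, "age": None, "city": "Oslo"},
    {"id": 3, "age": 29, "city": ""},
    {"id": 4, "age": 41, "city": "Lima"},
]


def missing_share(rows, column):
    """Return the fraction of rows where `column` is absent, None, or empty."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


# Share of missing entries for each column of interest.
report = {col: missing_share(records, col) for col in ("id", "age", "city")}
print(report)  # {'id': 0.0, 'age': 0.25, 'city': 0.25}
```

What counts as a "minimal" share of missing entries depends on the analysis, so any cutoff applied to this report would be a project-specific choice.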

Usability is the ease with which an analysis can be conducted to uncover useful findings from the dataset.

Proper formatting and a well-organized codebook help ensure data quality of a high standard.

For a dataset displaying poor quality, a data improvement plan will be needed to make the required adjustments.