Cost-benefit analysis of data intelligence
Abstract
All data intelligence processes are designed to process a finite amount of data within a limited time period. In practice, they all encounter
some difficulties, such as the lack of adequate techniques for extracting meaningful information from raw data; incomplete, incorrect
or noisy data; biases encoded in computer algorithms or held by human analysts; lack of computational or human resources; urgency in
making a decision; and so on. While there is great enthusiasm for developing automated data intelligence processes, it is also known that
many such processes may suffer from the phenomenon of data processing inequality, which casts fundamental doubt on the credibility of these
processes. In this talk, the speaker will discuss the recent development of an information-theoretic measure (by Chen and Golan) for optimizing
the cost-benefit ratio of a data intelligence process, and will illustrate its applicability using examples of data analysis and
visualization processes, including some in bioinformatics.
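
As a minimal illustration of the data processing inequality mentioned above (and not of the Chen and Golan measure itself), the following Python sketch builds a Markov chain X -> Y -> Z and checks that the second processing step cannot increase the information about X. The binary symmetric channels, the flip probabilities 0.1 and 0.2, and all function names are assumptions chosen for this example.

```python
from math import log2


def mutual_information(joint):
    """Mutual information I(A;B) in bits, from a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (p_a[a] * p_b[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )


def apply_channel(joint, channel):
    """Joint of (A, C), given the joint of (A, B) and a channel p(c | b)."""
    n_out = len(channel[0])
    return [
        [sum(row[b] * channel[b][c] for b in range(len(row))) for c in range(n_out)]
        for row in joint
    ]


def binary_symmetric_channel(flip):
    """Transition matrix p(out | in) that flips a bit with probability `flip`."""
    return [[1 - flip, flip], [flip, 1 - flip]]


# Hypothetical setup: X is a fair bit, Y is a noisy copy of X,
# and Z is a further noisy copy of Y (so X -> Y -> Z is a Markov chain).
p_x = [0.5, 0.5]
first_step = binary_symmetric_channel(0.1)   # assumed noise level
second_step = binary_symmetric_channel(0.2)  # assumed noise level

joint_xy = [[p_x[x] * first_step[x][y] for y in range(2)] for x in range(2)]
joint_xz = apply_channel(joint_xy, second_step)

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)

print(f"I(X;Y) = {i_xy:.3f} bits")
print(f"I(X;Z) = {i_xz:.3f} bits")
print("Data processing inequality holds:", i_xz <= i_xy)
```

In this toy setting each additional processing step can only preserve or lose information about the original variable, which is the concern the abstract raises about chains of automated processing.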