We develop a framework for capturing the instrumental value of data production processes, which accounts for two key factors: (a) the context of the agent’s decision-making; (b) how much data or information the buyer already possesses. We "micro-found" our data valuation function by establishing its connection to classic notions of signals and information design in economics. When instantiated in Bayesian linear regression, our value naturally corresponds to information gain. Applying our proposed data value in Bayesian linear regression for monopoly pricing, we show that if the seller can fully customize data production, she can extract the first-best revenue (i.e., full surplus) from any population of buyers, i.e., achieving first-degree price discrimination. If data can only be constructed from an existing data pool, this limits the seller’s ability to customize, and achieving first-best revenue becomes generally impossible. However, we design a mechanism that achieves seller revenue at most $\log(\kappa)$ less than the first-best, where $\kappa$ is the condition number associated with the data matrix. As a corollary, the seller extracts the first-best revenue in the multi-armed bandits special case.
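
For orientation, the information gain mentioned above can be written out in the textbook conjugate Gaussian setting. This is a sketch under assumed notation that the abstract itself does not fix: a prior $\theta \sim \mathcal{N}(0, \Sigma_0)$ on the regression coefficients, a data matrix $X \in \mathbb{R}^{n \times d}$, and observations $y = X\theta + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n)$. In that setting the mutual information between $\theta$ and $y$ is

$$
\mathrm{IG}(X) \;=\; I(\theta;\, y \mid X) \;=\; \tfrac{1}{2} \log\det\!\left(I_d + \sigma^{-2}\, \Sigma_0 X^\top X\right) \;=\; \tfrac{1}{2} \log \frac{\det \Sigma_0}{\det \Sigma_n},
\qquad
\Sigma_n = \left(\Sigma_0^{-1} + \sigma^{-2} X^\top X\right)^{-1}.
$$

The second equality follows from $\det \Sigma_0 \cdot \det\!\left(\Sigma_0^{-1} + \sigma^{-2} X^\top X\right) = \det\!\left(I_d + \sigma^{-2} \Sigma_0 X^\top X\right)$. The expression depends on the buyer's prior $\Sigma_0$, so the same data matrix is worth less to a buyer who already holds more information. Whether the valuation function in this work takes exactly this form, and how $\kappa$ is defined from the data matrix, is specified in the body of the paper rather than here.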