A Framework for Graph-Conditioned Hierarchical Shapley Attribution in Patent Valuation

2026-06-01 • Computer Science and Game Theory

Computer Science and Game Theory, Artificial Intelligence

AI summary

The authors tackle the hard problem of figuring out how much money one patent contributes to a product that includes thousands of patents. They introduce PatentXAI, a method that uses ideas from explainable AI and game theory (specifically the Shapley value) to fairly share profits among patents. To make calculations manageable, they focus only on a relevant subset of patents connected in a knowledge graph, called the Markov Blanket. Their tests show this approach is much faster and reasonably accurate compared to exact methods. The authors note that estimating the actual profit linked to any group of patents remains a challenge and suggest next steps using public patent and technology datasets.

Patent valuation, Shapley value, Explainable AI, Markov Blanket, Knowledge graph, Conditional independence, Coalition game theory, Patent economics, Profit allocation, Monte Carlo simulation

Authors

Joy Bose

Abstract

Estimating the economic contribution of a single patent inside a product that embodies tens of thousands of patents is a long-standing unsolved problem in intellectual property economics. We propose PatentXAI, a framework that treats patent valuation as an explainable-AI problem: given a characteristic function v(S) encoding the revenue achievable by a patent subset S, a patent's Shapley value measures its fair share of product profit and satisfies the efficiency, symmetry, dummy-player, and additivity axioms. To make computation tractable, we restrict each patent's coalitions to its Markov Blanket inside a knowledge graph, a restriction grounded in the C-SVE conditional independence theorem (Li et al., 2020). Scaling experiments from n=12 to n=100 patents on Pareto-distributed coverage graphs report a median Markov Blanket size of 32.9 percent of n at n=100, with a 90th-percentile blanket size of 55.2 percent of n and a runtime of 10 milliseconds per patent. The difference from exact ground truth at n=12 is 0.088; the difference from a high-sample Monte Carlo reference at n=100 is 0.062 ± 0.003. A dense-component experiment shows that when 80 percent of patents share one component, the blanket correctly expands to cover that dense cluster, and the difference from the reference falls to 0.039 because the pooled computation becomes more accurate on homogeneous portfolios. Profit allocation proceeds hierarchically: exact Shapley distributes total profit among macro-components, then centrality-weighted Shapley distributes each component's budget among the patents covering it. Estimating v(S) from real data is the primary open problem; we distinguish it from the computational contribution and outline a concrete roadmap for empirical validation using public ETSI, USPTO, and Lens.org datasets.
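To make the two central ideas of the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It computes exact Shapley values for a toy characteristic function and then recomputes each patent's value using only a restricted neighbourhood of players. Everything in it is an illustrative assumption: the three patents, the component revenues, the coverage-style `toy_v`, and especially `neighbourhood`, which uses "patents that share a component" as a simple stand-in for the Markov Blanket and does not reproduce the paper's knowledge-graph construction.

```python
from itertools import permutations
from math import factorial

# Hypothetical toy data: which components each patent covers, and the
# revenue attributed to each component. None of this comes from the paper.
COVERAGE = {"P1": {"radio"}, "P2": {"radio", "codec"}, "P3": {"codec"}}
COMPONENT_REVENUE = {"radio": 60.0, "codec": 40.0}


def toy_v(coalition):
    """Assumed characteristic function: revenue of all components covered
    by at least one patent in the coalition."""
    covered = set()
    for patent in coalition:
        covered |= COVERAGE[patent]
    return sum(COMPONENT_REVENUE[c] for c in covered)


def exact_shapley(players, v):
    """Exact Shapley values: average each player's marginal contribution
    over every ordering of the players (factorial cost, small n only)."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            totals[p] += v(extended) - v(coalition)
            coalition = extended
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def neighbourhood(patent):
    """Stand-in for a Markov Blanket (assumption): all other patents that
    share at least one component with the given patent."""
    return [q for q in COVERAGE
            if q != patent and COVERAGE[q] & COVERAGE[patent]]


def restricted_shapley(patent, v):
    """Shapley value of one patent in the game restricted to the patent
    plus its neighbourhood, ignoring every other player."""
    players = [patent] + neighbourhood(patent)
    return exact_shapley(players, v)[patent]


full = exact_shapley(list(COVERAGE), toy_v)
restricted = {p: restricted_shapley(p, toy_v) for p in COVERAGE}
print(full)        # exact values over all three patents
print(restricted)  # values computed from each patent's neighbourhood only
```

In this toy game the full values sum to `toy_v` of the whole portfolio, which is the efficiency axiom, and the restricted values coincide with the full ones because patents outside a neighbourhood never change that patent's marginal contribution. On less separable games the restricted values would only approximate the exact ones, which is the kind of gap the reported differences against reference values measure.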
