Probabilistic Circuits (PCs) offer a unified framework for tractable probability distributions, enabling efficient probabilistic inference through structured computation graphs. Researchers are advancing their speed and scalability via GPU parallelization, tensorized designs, and even custom hardware like DAG Processing Units. With applications ranging from explainability and data compression to neuro-symbolic AI and large language model detoxification, PCs are emerging as a powerful foundation for the next wave of efficient, interpretable AI.

Why Researchers Are Betting on PCs to Power the Next Wave of AI


Abstract and 1. Introduction

  2. Preliminaries and Related Work

  3. Key Bottlenecks in PC Parallelization

  4. Harnessing Block-Based PC Parallelization

    4.1. Fully Connected Sum Layers

    4.2. Generalizing To Practical Sum Layers

    4.3. Efficient Implementations by Compiling PC Layers

    4.4. Analysis: IO and Computation Overhead

  5. Optimizing Backpropagation with PC Flows

  6. Experiments

    6.1. Faster Models with PyJuice

    6.2. Better PCs At Scale

    6.3. Benchmarking Existing PCs

  7. Conclusion, Acknowledgements, Impact Statement, and References

A. Algorithm Details

B. Additional Technical Details

C. Experimental Details

D. Additional Experiments


2. Preliminaries and Related Work

Many probabilistic inference tasks can be cast into computing sums of products. By viewing them from a computation graph standpoint, PCs provide a unified perspective on many bespoke representations of tractable probability distributions, including Arithmetic Circuits (Darwiche, 2002; 2003), Sum-Product Networks (Poon & Domingos, 2011), Cutset Networks (Rahman et al., 2014), and Hidden Markov Models (Rabiner & Juang, 1986). Specifically, PCs define distributions with computation graphs consisting of sum and product operations, as elaborated below.
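The sum-of-products recursion described above is usually written in the PC literature in the following form. This is a sketch of the standard definition, not a verbatim copy of the paper's own equation; the symbols $f_n$ (the distribution encoded by an input node), $\mathsf{ch}(n)$ (the children of node $n$), and $\theta_{n,c}$ (the non-negative weight on edge $(n, c)$) are notation assumed here.

```latex
p_n(\mathbf{x}) =
\begin{cases}
f_n(\mathbf{x}) & \text{if } n \text{ is an input node},\\[2pt]
\prod_{c \in \mathsf{ch}(n)} p_c(\mathbf{x}) & \text{if } n \text{ is a product node},\\[2pt]
\sum_{c \in \mathsf{ch}(n)} \theta_{n,c}\, p_c(\mathbf{x}) & \text{if } n \text{ is a sum node}.
\end{cases}
```

Under this reading, the distribution represented by the PC is the output $p_r(\mathbf{x})$ of its root node $r$.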


The key to guaranteeing exact and efficient computation of various probabilistic queries is to impose proper structural constraints on the DAG of the PC. For example, with smoothness and decomposability (Poon & Domingos, 2011), computing any marginal probability amounts to a single forward pass (children before parents) following Equation (1), the only change being that input nodes defined on marginalized variables are set to 1. See Choi et al. (2020) for a comprehensive overview of the different structural constraints and the queries they enable.
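To make the marginalization rule concrete, here is a minimal sketch in plain Python. The tuple-based node encoding, the `evaluate` function, and the toy parameters are all hypothetical and chosen only for illustration; they do not come from the paper or from any PC library. The sketch evaluates a small smooth and decomposable PC over two binary variables and checks the one-pass marginal against brute-force summation.

```python
# Hypothetical node encodings (illustrative only):
#   ("input", var, [P(var=0), P(var=1)])
#   ("product", [child, ...])
#   ("sum", [(weight, child), ...])

def evaluate(node, assignment):
    """Forward pass: children are evaluated before their parent.

    `assignment` maps a variable index to 0 or 1, or to None when the
    variable is marginalized out, in which case its input nodes output 1.
    Plain recursion re-evaluates shared children; that is acceptable for
    a toy example but not for a large DAG.
    """
    kind = node[0]
    if kind == "input":
        _, var, probs = node
        value = assignment[var]
        return 1.0 if value is None else probs[value]
    if kind == "product":
        result = 1.0
        for child in node[1]:
            result *= evaluate(child, assignment)
        return result
    if kind == "sum":
        return sum(w * evaluate(child, assignment) for w, child in node[1])
    raise ValueError(f"unknown node type: {kind}")

# A mixture of two fully factorized components over X0 and X1.
component_a = ("product", [("input", 0, [0.9, 0.1]), ("input", 1, [0.2, 0.8])])
component_b = ("product", [("input", 0, [0.3, 0.7]), ("input", 1, [0.6, 0.4])])
root = ("sum", [(0.4, component_a), (0.6, component_b)])

# Marginal P(X0 = 1) from a single forward pass, with X1 marginalized.
marginal = evaluate(root, {0: 1, 1: None})

# Brute-force reference: sum the joint over both values of X1.
brute_force = sum(evaluate(root, {0: 1, 1: x1}) for x1 in (0, 1))
```

Both quantities agree because every sum node mixes children over the same variables (smoothness) and every product node multiplies children over disjoint variables (decomposability), so summing out X1 can be pushed down to the input nodes.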


For example, Peharz et al. (2020a) demonstrate how the above parameter gradients can be used to apply Expectation-Maximization (EM) updates, and Vergari et al. (2021) elaborate how the forward pass can be used to compute various probabilistic and information-theoretic queries when coupled with PC structure transformation algorithms. Therefore, the speed and memory efficiency of these two procedures (the forward and backward passes) largely determine the overall efficiency of PCs.

Figure 1. Layering a PC by grouping nodes with the same topological depth (as indicated by the colors) into disjoint subsets. Both the forward and the backward computation can be carried out independently on nodes within the same layer.

Related work on accelerating PCs. Considerable effort has gone into speeding up training and inference for PCs. One of the earliest approaches performs node-based computation on both CPUs (Lowd & Rooshenas, 2015) and GPUs (Pronobis et al., 2017; Molina et al., 2019), i.e., it computes the outputs for a mini-batch of inputs (data) recursively for every node. Despite its simplicity, this approach fails to fully exploit the parallelism of modern GPUs because it can only parallelize over a batch of samples. This problem is mitigated by also parallelizing over topologically independent nodes (Peharz et al., 2020a; Dang et al., 2021). Specifically, a PC is chunked into topological layers, where nodes in the same layer can be computed in parallel. This leads to a speedup of 1-2 orders of magnitude compared to node-based computation.
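The layering idea can be sketched in a few lines of plain Python. The `layerize` function, the dictionary-based graph encoding, and the node names below are hypothetical and serve only as an illustration; they are not taken from the cited implementations. A node's depth is 0 if it has no children and otherwise one more than the maximum depth of its children, so a parent always lands in a strictly later layer than any of its children.

```python
from collections import defaultdict

def layerize(children):
    """Group the nodes of a DAG into layers by topological depth.

    `children` maps each node id to the list of its child ids
    (input nodes map to an empty list). Returns a list of layers,
    ordered from inputs to root, each a sorted list of node ids.
    """
    depth = {}

    def node_depth(n):
        # Memoized recursion; very deep circuits would need an
        # iterative traversal to avoid Python's recursion limit.
        if n not in depth:
            kids = children[n]
            depth[n] = 0 if not kids else 1 + max(node_depth(c) for c in kids)
        return depth[n]

    layers = defaultdict(list)
    for n in children:
        layers[node_depth(n)].append(n)
    return [sorted(layers[d]) for d in sorted(layers)]

# Hypothetical structure: four input nodes, two product nodes, one sum root.
toy_children = {
    "x0_a": [], "x1_a": [], "x0_b": [], "x1_b": [],
    "prod_a": ["x0_a", "x1_a"],
    "prod_b": ["x0_b", "x1_b"],
    "root": ["prod_a", "prod_b"],
}
toy_layers = layerize(toy_children)
```

Since no node in a layer is a child of another node in the same layer, all nodes of a layer can be evaluated in parallel once the earlier layers are done.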

The regularity of edge connection patterns is another key factor influencing the design choices. Specifically, EiNets (Peharz et al., 2020a) leverage off-the-shelf Einsum operations to parallelize dense PCs, where every layer contains groups of densely connected sum and product/input nodes. Mari et al. (2023) generalize the notion of dense PCs to tensorized PCs, which greatly expands the scope of EiNets. Dang et al. (2021) instead focus on speeding up sparse PCs, where different nodes can have drastically different numbers of edges. They use custom CUDA kernels to balance the workload across GPU threads and achieve decent speedups on both sparse and dense PCs.
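As a rough illustration of why dense layers are easy to parallelize, the following plain-Python sketch evaluates one fully connected sum layer in log space. It is not the EiNets implementation, which relies on Einsum operations in a tensor library; the function names `logsumexp` and `dense_sum_layer` and the toy numbers are assumptions made for this example.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def dense_sum_layer(log_weights, child_log_probs):
    """Evaluate a fully connected sum layer in log space for one sample.

    log_weights[i][j]  : log-weight on the edge from sum node i to child j
    child_log_probs[j] : log-output of child j
    Every sum node reads the same child vector, so each row is an
    independent weighted reduction, the pattern that a matrix product
    or Einsum call computes for all rows at once.
    """
    return [
        logsumexp([lw + lp for lw, lp in zip(row, child_log_probs)])
        for row in log_weights
    ]

# Two sum nodes, each connected to the same two children.
weights = [[0.4, 0.6], [0.5, 0.5]]
child_probs = [0.02, 0.42]
log_out = dense_sum_layer(
    [[math.log(w) for w in row] for row in weights],
    [math.log(p) for p in child_probs],
)
```

In a sparse PC the rows would have different lengths, which is what makes balancing the workload across GPU threads harder than in this regular case.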

Another thread of work focuses on designing computation hardware that is more suitable for PCs. Specifically, Shah et al. (2021) propose DAG Processing Units (DPUs) that can efficiently traverse sparse PCs, Dadu et al. (2019) introduce an indirect read reorder-buffer to improve the efficiency of data-dependent memory accesses in PCs, and Yao et al. (2023) use addition-as-int multiplications to significantly improve the energy efficiency of PC inference algorithms.

Figure 2. Runtime breakdown of the feedforward pass of a PC with ∼150M edges. Both the IO and the computation overhead of the sum layers are significantly larger than the total runtime of product layers. Detailed configurations of the PC are shown in the table.

Applications of PCs. PCs have been applied to many domains such as explainability and causality (Correia et al., 2020; Wang & Kwiatkowska, 2023), graph link prediction (Loconte et al., 2023), lossless data compression (Liu et al., 2022), neuro-symbolic AI (Xu et al., 2018; Manhaeve et al., 2018; Ahmed et al., 2022a;b), gradient estimation (Ahmed et al., 2023b), graph neural network rewiring (Qian et al., 2023), and even large language model detoxification (Ahmed et al., 2023a).


:::info Authors:

(1) Anji Liu, Department of Computer Science, University of California, Los Angeles, USA ([email protected]);

(2) Kareem Ahmed, Department of Computer Science, University of California, Los Angeles, USA;

(3) Guy Van den Broeck, Department of Computer Science, University of California, Los Angeles, USA;

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
