Differentiable Packing of Irregular 3D Objects with Adaptive Container Estimation

2026-06-15 • Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Graphics, Machine Learning

AI summary

The authors present a packing method that optimizes the object poses and all three container side lengths together in one gradient-based loop, rather than fixing the container or searching over a single dimension while the others are tuned by hand. The losses are computed on triangle meshes through axis-aligned bounding-box proxies, and the container is tightened periodically whenever the overlap loss falls below a threshold. Pairwise computations use tensor broadcasting, which is 3.4 to 54 times faster than a loop-based reference implementation, and the pipeline needs no physics engine. The resulting containers are 11 to 32 percent smaller than those of time-matched DBLF and simulated-annealing baselines at N = 100, and each instance runs in under 4 minutes on a single consumer GPU.

differentiable packing, object pose optimization, container size optimization, triangle meshes, axis-aligned bounding box, gradient-based optimization, tensor broadcasting, overlap loss, simulated annealing, DBLF

Authors

Palak Gupta, Shanmuganathan Raman

Abstract

Most existing packing approaches either fix the container in advance or optimize only a single container dimension through an outer search loop, leaving the remaining dimensions as a manual tuning problem. We present a differentiable packing framework that jointly optimizes all 6N object pose parameters and all three container side lengths inside a single gradient-based loop. The formulation combines six physics-inspired, differentiable loss terms computed directly on triangle meshes through axis-aligned bounding-box proxies. An adaptive squeezing mechanism periodically tightens the container whenever the overlap loss falls below a pair-count-scaled threshold, producing a large initial drop in container volume followed by small refinements. All pairwise computations are written in tensor-broadcasting form, giving a 3.4× to 54× speedup over a reference loop-based implementation. The pipeline is implemented in Python and PyTorch, with no physics engine, FFT library, or convex decomposition. On multiple object categories, the method produces containers that are 11 to 32 percent smaller than time-matched DBLF and simulated-annealing baselines at N = 100, while running in under 4 minutes per instance on a single consumer GPU.
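
The following is a minimal sketch of two ideas named in the abstract: a pairwise overlap term on axis-aligned bounding boxes written in broadcasting form, and a squeeze step gated by a pair-count-scaled threshold. It is not the authors' implementation. It uses NumPy rather than PyTorch so that it runs without a GPU stack, and every name and constant in it (`pairwise_overlap_loss`, `maybe_squeeze`, `eps_per_pair`, `shrink`) is an assumption made for illustration. The abstract does not give the exact form of the threshold or the shrink rule.

```python
import numpy as np


def pairwise_overlap_loss(centers, half_extents):
    """Sum of AABB intersection volumes over all unordered pairs (broadcast form).

    centers, half_extents: arrays of shape (N, 3).
    """
    lo = centers - half_extents  # (N, 3) box minima
    hi = centers + half_extents  # (N, 3) box maxima
    # Broadcast (N, 1, 3) against (1, N, 3) to get per-axis overlap, shape (N, N, 3).
    inter = np.minimum(hi[:, None, :], hi[None, :, :]) - np.maximum(
        lo[:, None, :], lo[None, :, :]
    )
    inter = np.clip(inter, 0.0, None)  # negative extent means no overlap on that axis
    vol = inter.prod(axis=-1)          # (N, N) intersection volumes
    iu = np.triu_indices(len(centers), k=1)  # count each pair once, skip self-pairs
    return float(vol[iu].sum())


def pairwise_overlap_loss_loop(centers, half_extents):
    """Reference double loop computing the same quantity, for comparison."""
    n = len(centers)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            lo = np.maximum(centers[i] - half_extents[i], centers[j] - half_extents[j])
            hi = np.minimum(centers[i] + half_extents[i], centers[j] + half_extents[j])
            total += float(np.prod(np.clip(hi - lo, 0.0, None)))
    return total


def maybe_squeeze(container, overlap, n_objects, eps_per_pair=1e-4, shrink=0.98):
    """Shrink all three side lengths when overlap is below a pair-count-scaled threshold.

    eps_per_pair and shrink are hypothetical values, not taken from the paper.
    """
    n_pairs = n_objects * (n_objects - 1) // 2
    if overlap < eps_per_pair * n_pairs:
        return container * shrink
    return container


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c = rng.uniform(-1.0, 1.0, size=(50, 3))
    h = rng.uniform(0.1, 0.3, size=(50, 3))
    print(pairwise_overlap_loss(c, h), pairwise_overlap_loss_loop(c, h))
```

In a PyTorch version the same indexing pattern would apply to tensors, with `torch.clamp` in place of `np.clip`, so that gradients flow back to the object poses and container side lengths.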
