Picking an item in a multi-object scene is challenging because occlusions and partial views reduce the fidelity of object recognition, so the point cloud cannot simply be trusted.

This project

  1. proposes a perception pipeline that detects objects in a given image and predicts their existence probabilities.
  2. performs a geometric model matching process to return a certain number of pose hypotheses.
  3. models the problem as a stochastic version of the minimum constraint removal (MCR) problem and returns safe and effective picking paths for a robotic arm that (1) minimize collision probability and (2) maximize the probability of reaching the target object.
  4. performs experiments with real sensing data in diverse multi-object scenarios.
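
To make the idea behind step 3 concrete, here is a minimal sketch of how candidate paths might be ranked by collision probability. It is not the project's implementation: the names `ObjectBelief`, `collision_probability`, and `safest_path` are hypothetical, the numbers are made up for illustration, and the code assumes that objects are statistically independent and that each candidate path comes with the set of pose hypotheses it intersects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectBelief:
    """Hypothetical container for the perception output of one object."""
    name: str
    existence_prob: float      # probability that the detection is a real object
    hypothesis_weights: tuple  # weights over pose hypotheses, assumed to sum to 1


def collision_probability(path_hits, beliefs):
    """Probability that a path collides with at least one object.

    path_hits maps an object name to the set of pose-hypothesis indices
    that the path intersects. Objects are assumed independent.
    """
    no_collision = 1.0
    for belief in beliefs:
        hit = path_hits.get(belief.name, set())
        # The object blocks the path only if it exists AND sits in a hit pose.
        p_hit = belief.existence_prob * sum(belief.hypothesis_weights[i] for i in hit)
        no_collision *= 1.0 - p_hit
    return 1.0 - no_collision


def safest_path(candidates, beliefs):
    """Return the name of the candidate path with the lowest collision probability."""
    return min(candidates, key=lambda name: collision_probability(candidates[name], beliefs))


# Illustrative values only.
beliefs = [
    ObjectBelief("mug", 0.9, (0.5, 0.3, 0.2)),
    ObjectBelief("box", 0.6, (0.7, 0.3)),
]
candidates = {
    "A": {"mug": {0}},              # crosses the most likely mug pose
    "B": {"mug": {2}, "box": {1}},  # crosses two less likely poses
}
best = safest_path(candidates, beliefs)
print(best, collision_probability(candidates[best], beliefs))
```

In this toy setup, path B touches more hypotheses than path A but is still preferred, because the hypotheses it crosses carry less probability mass. A fuller treatment would also weigh the probability of reaching the target object, which this sketch leaves out.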


In real-world experiments, we use a robot system in which an Azure Kinect camera is mounted on top of a humanoid Motoman SDA10F robot to provide an overhead view of the objects on the table. The dataset used in the real-world experiments can be downloaded here.