Perception is designed to empower Unity ML-Agents with advanced computer vision capabilities. It allows a standard Unity Camera to generate rich ground truth data, including 2D bounding boxes, semantic segmentation masks, and gray-scale depth maps.
Perception is specifically engineered for runtime integration with Unity ML-Agents. It wraps these ground truth generators in custom Sensor Components, allowing Agents to directly observe the segmented, labeled, and depth-mapped world. The observations are serialized and sent to the external Python environment via the ML-Agents API, enabling the training of complex vision-based reinforcement learning models. A Python sample is available on GitHub: https://github.com/BlueFisher/Perception-for-ML-Agents-Sample/blob/master/Perception_unity_wrapper.ipynb.
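On the Python side, each visual observation arrives as a NumPy array (for example via `decision_steps.obs` in `mlagents_envs`). As a minimal sketch of working with a gray-scale depth observation, the snippet below converts a normalized `(H, W, 1)` depth array to metric distances. The linear depth encoding and the `NEAR`/`FAR` clip-plane values are assumptions for illustration; the actual encoding and planes depend on your Unity Camera setup.

```python
import numpy as np

# Hypothetical camera clip planes; the real values come from the Unity Camera.
NEAR, FAR = 0.3, 100.0

def depth_obs_to_meters(obs, near=NEAR, far=FAR):
    """Map a normalized gray-scale depth observation of shape (H, W, 1),
    with values in [0, 1], to metric distances, assuming a linear encoding."""
    d = np.asarray(obs, dtype=np.float32).squeeze(-1)  # drop channel axis
    return near + d * (far - near)

# Synthetic stand-in for a visual observation from the ML-Agents Python API:
# an 84x84 single-channel depth map at mid-range.
obs = np.full((84, 84, 1), 0.5, dtype=np.float32)
meters = depth_obs_to_meters(obs)
print(meters.shape)  # (84, 84)
```

In a real training loop the `obs` array would come from the environment step rather than being constructed by hand.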
Perception is a fork of the now-deprecated com.unity.perception. It strips out the dataset capture (offline data collection) features to focus on lightweight, real-time ground truth generation, and has been migrated to support the Unity 6 render graph.
Requires com.unity.ml-agents >= 4.0.0 (ML-Agents Release 23)
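For reference, the dependency can be declared in your project's `Packages/manifest.json`; this is a minimal fragment showing only the entry from the requirement above (your manifest will contain other packages as well):

```json
{
  "dependencies": {
    "com.unity.ml-agents": "4.0.0"
  }
}
```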