Machine Learning | Cross-Modal AI | Safe AI

San Francisco Bay Area

Hi! I am Paridhi Singh. I am a Machine Learning Engineer currently working at Meta, with 5+ years of industry experience in Computer Vision, Generative AI, and 3D scene representation. Passionate about building scalable, reliable, and safe AI systems, I focus on advancing multimodal technologies that seamlessly integrate diverse modalities to drive real-world impact.

I have been fortunate to contribute to the field through patents, peer-reviewed publications, and technical talks at international conferences such as CVPR, Women of Silicon Valley, and AI4. My work has earned recognition, including the Top 50 Women of Impact 2025 award, and opportunities to collaborate with inspiring researchers and innovators. I am committed to building AI systems that align with ethical principles and create tangible value for society.

Outside work, I enjoy contributing to open-source projects, mentoring aspiring ML engineers, and exploring emerging AI trends. In my downtime, you’ll often find me immersed in nature or hiking scenic trails.

Research Work

Static 3D Scene Modeling with Lang-splats

Paridhi Singh, Zaid Tasneem, Tony Yu, Akshat Dave

Ongoing research

Objects as Spatio-Temporal 2.5D Points

Paridhi Singh, Gaurav Singh, Arun Kumar

CVPR(W)-2022

A single-stage, weakly supervised architecture that learns to detect, track, and model objects in 3D using only 2D annotations.

Unseen Object Reasoning with Shared Appearance Cues

Paridhi Singh, Arun Kumar

Novel learning paradigm for open-world object reasoning by capturing similarities between object classes and representing objects through shared features, enabling reasoning about unseen objects.

Projects

Project 1

Vehicle Damage Segmentation and 3D Depth Analysis

Designed and implemented a damage segmentation pipeline that uses large vision models (e.g., transformer-based architectures) to process customer-provided vehicle images with varying zoom levels and perspectives. This approach significantly outperformed traditional CNN-based segmentation methods on both precision and recall, raising baseline segmentation performance on noisy, real-world customer data.
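
For readers less familiar with the metrics mentioned above, here is a minimal sketch of pixel-wise precision and recall for a binary segmentation mask. It is an illustration only, not the project's evaluation code: the function name `precision_recall` and the toy masks are hypothetical, and a real pipeline would more likely compute these over array or tensor masks for a whole dataset.

```python
def precision_recall(pred_mask, true_mask):
    """Pixel-wise precision and recall for two binary masks of equal shape.

    Masks are nested lists of 0/1 values, where 1 marks a "damage" pixel.
    Returns (precision, recall); a metric is 0.0 when its denominator is zero.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # predicted damage, and it is damage
            elif p and not t:
                fp += 1  # predicted damage, but it is background
            elif t and not p:
                fn += 1  # missed a damage pixel
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical 3x4 masks, made up for illustration only.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

precision, recall = precision_recall(predicted, ground_truth)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision penalizes background pixels wrongly marked as damage, while recall penalizes damage pixels that were missed, which is why the two are reported together.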

I've got many more exciting projects and research works from my past to share. Just need a moment to work my web wizardry! So stay tuned!