Computer Vision Embedded Engineer
Vimaan Robotics is a privately held technology company, founded in 2017 and headquartered in Silicon Valley. Vimaan is driving a paradigm shift in the way computer vision is enabled and leveraged for inventory management in the supply chain and logistics industry. Vimaan’s proprietary SaaS-based end-to-end solution provides comprehensive, real-time tracking of inventory movement and status within the
warehouse; integrates seamlessly into existing legacy workflows and ecosystems; enables full autonomy and scalability; generates rich, actionable data for supply chain operators; and delivers dramatic monetary returns through cost savings and improved customer satisfaction. In physical goods industries that require significant investment to install and maintain modern supply chains, Vimaan’s solution will ultimately enable its customers to leverage bleeding-edge technology to compete with fully integrated e-commerce players.
The company’s founders have a track record of founding and successfully exiting multiple companies. Vimaan plans to emerge from stealth mode soon; it has already raised $25M in venture capital from blue-chip Silicon Valley investors and has engagements with multiple Fortune 500 customers. The company has over 70 employees across various locations in the US and one location in Bengaluru, India.
About the Role
As the company scales its products across customer facilities, you, as Module Lead, will lead the optimization and deployment of deep-learning models. Optimization covers the entire pipeline, i.e. pre-processing, inference, and post-processing, and can range from exploring various optimization frameworks to model pruning. You will take models from the developers of various products and be responsible for optimizing them and then deploying the optimized models in a production environment, while ensuring model accuracy remains unchanged. Target deployment devices range from servers to edge devices but are known a priori, so the optimization strategy changes accordingly. Optimization targets can include GPU memory consumption, model size, inference pipeline speed, and power consumption.
Responsibilities
- Helping plan each model’s optimization and maximizing performance from heterogeneous systems in a principled manner.
- Profiling each model in the current inference pipeline in detail, following a structured approach.
- Identifying bottlenecks based on the profiling results. Knowledge of tools such as NVIDIA Nsight for bottleneck identification is a must.
- Implementing optimization techniques to improve performance. Targets can be one or more of: inference speed, GPU-accelerated pre-processing or post-processing, power consumption, model size, and GPU memory usage.
- Detailed profiling of the optimized model
- Performing comparative analysis to measure the performance gain
- Containerizing optimized models to run in isolated environments
- Deploying optimized models on the target device and ensuring performance is as per expectation
- Documenting the entire approach for reproducibility
- Training other team members with the relevant skills and sharing best practices within the team
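To give a concrete flavor of the profile-optimize-compare loop described above, here is a minimal, illustrative PyTorch sketch: post-training dynamic INT8 quantization of a toy model, with a before/after comparison of serialized size and CPU latency. The model, sizes, and timing harness are illustrative assumptions, not Vimaan's actual pipeline or production methodology.

```python
# Illustrative sketch only: dynamic INT8 quantization plus a simple
# size/latency comparison. The model is a stand-in, not a real CV model.
import io
import time

import torch
import torch.nn as nn

# Stand-in model; in practice this would be a trained vision model.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
model.eval()

# Post-training dynamic quantization: Linear weights become INT8.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_bytes(m: nn.Module) -> int:
    """Serialized state-dict size, used here as a proxy for model size."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

def latency_ms(m: nn.Module, runs: int = 50) -> float:
    """Average CPU inference latency over a fixed input."""
    x = torch.randn(1, 512)
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(runs):
            m(x)
    return (time.perf_counter() - start) / runs * 1e3

print(f"size:    {size_bytes(model)} -> {size_bytes(quantized)} bytes")
print(f"latency: {latency_ms(model):.3f} -> {latency_ms(quantized):.3f} ms")
```

In a real deployment the same before/after measurements would be taken with the target runtime (e.g. TensorRT on the target GPU or Jetson device) and accuracy would be re-validated on a held-out set before release.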
Requirements
- Proven work experience in the relevant domain, with a track record of successfully deploying models in production
- Detailed knowledge and hands-on experience with popular optimization frameworks like TensorRT
- Knowledge of various DL architectures to determine the best choice for a given CV problem
- Hands-on experience with programming languages such as C/C++ and Python
- Hands-on experience with deep-learning frameworks like TensorFlow, PyTorch
- Prior hands-on experience with model pruning, graph compilation and quantization techniques
- Prior experience optimizing models specifically for edge devices
- Prior experience with the NVIDIA Jetson device family is a must
- Prior experience porting Python models to C is a huge plus
- By nature, be investigative: curious, methodical, rational, analytical, and logical, with the ability to work under pressure and short timelines
- Target platforms:
- Intel – OpenVINO – CPU, Integrated GPU, VPU, FPGA
- NVIDIA – TensorRT – GPU, Jetson family
- Cross compiler – Apache TVM
- Edge device specific optimizations – TensorFlow Lite
- Experience in optimizing model training is a plus. Hands-on experience with techniques like mixed-precision training and model distillation is a huge plus
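One of the compression techniques the requirements name is model pruning. A minimal, illustrative PyTorch sketch, using the built-in `torch.nn.utils.prune` utilities on a stand-in layer (the layer and pruning amount are arbitrary choices for demonstration):

```python
# Illustrative sketch only: L1 unstructured pruning of a Linear layer.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(256, 256)

# Zero out the 50% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Make the pruning permanent: remove the mask, keep the zeroed weights.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.0%}")
```

Unstructured pruning like this shrinks the model only when paired with sparse storage or sparsity-aware runtimes; structured pruning (removing whole channels) is what typically translates into real speedups on edge hardware.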
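The last bullet mentions mixed-precision training. A minimal sketch with PyTorch's `torch.autocast`, shown on CPU with bfloat16 so it runs anywhere; the tiny model and data are placeholders (on a GPU one would use `device_type="cuda"`, typically with a `GradScaler` for float16):

```python
# Illustrative sketch only: mixed-precision training with torch.autocast.
import torch
import torch.nn as nn

model = nn.Linear(64, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 64), torch.randint(0, 4, (32,))

for _ in range(3):
    opt.zero_grad()
    # Eligible forward ops run in bfloat16; master weights stay float32.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.4f}")
```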