This course teaches you to apply deep learning architectures to computer vision tasks. You'll learn to combine convolutional and recurrent neural networks (CNNs and RNNs) to build an automatic image captioning application. The course covers advanced CNN architectures, including region-based CNNs such as Faster R-CNN for localized object recognition and the YOLO model for multi-object detection.
The course also covers hyperparameter tuning and includes an optional section on attention mechanisms. In the final project, you train a CNN-RNN model to generate image captions, with a focus on implementing an effective RNN decoder to pair with a CNN encoder.
💲 Paid to Audit
🕒 Approx. 1 Month
✏️ Advanced Level
🧾 Certificate Available Upon Completion
🎓 Offered by Udacity