20241116-20241120


Current detection systems repurpose classifiers to perform detection. To detect an object, these systems take a classifier for that object and evaluate it at various locations and scales in a test imaage. Systems like deformable parts models (DPM) use a sliding window approach where the classifier is run at evenly spaced locations over the entire image.

We reframe object detection as a single regression problem, straight from image pixels to bouding box coordinates and class probabilities. Using our system, you only look once (YOLO) at an image to predict what objects are present and where they are.

We implement this model as a convolutional neural network and evaluate it on the PASCAL VOC detection dataset.

Unifying the complicated process that conventional approaches use with a neural network seems to be the appealing point of this research.

To fully understand the model, I should study neural networks first.


infinite lb

Total infinite kcal


MUST:

TODO:


index 20241115 20241121