ActivityNet
ActivityNet is a large-scale video benchmark for human activity understanding. It provides samples from 203 activity classes, with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours. The benchmark covers a wide range of complex human activities that are of interest to people in their daily lives, and it can be used to compare algorithms in three scenarios: untrimmed video classification, trimmed activity classification, and activity detection.
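To give a sense of scale, the per-class and per-video averages quoted above can be combined into rough totals. The sketch below is only back-of-the-envelope arithmetic on those four figures; the function name and dictionary keys are hypothetical, and because the inputs are rounded averages, the results are estimates rather than official dataset counts.

```python
# Figures quoted in the benchmark description.
NUM_CLASSES = 203
AVG_VIDEOS_PER_CLASS = 137
AVG_INSTANCES_PER_VIDEO = 1.41
TOTAL_VIDEO_HOURS = 849


def estimate_dataset_scale():
    """Derive approximate totals from the quoted averages (estimates only)."""
    # classes x average videos per class
    approx_videos = NUM_CLASSES * AVG_VIDEOS_PER_CLASS
    # videos x average activity instances per video
    approx_instances = approx_videos * AVG_INSTANCES_PER_VIDEO
    # total footage spread evenly over the estimated video count
    avg_minutes_per_video = TOTAL_VIDEO_HOURS * 60 / approx_videos
    return {
        "approx_videos": approx_videos,
        "approx_instances": approx_instances,
        "avg_minutes_per_video": avg_minutes_per_video,
    }


stats = estimate_dataset_scale()
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the estimate comes to roughly 27,800 untrimmed videos, about 39,200 activity instances, and a little under two minutes of footage per video on average.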
Methodology
Imported from llm-stats public benchmark metadata. Modality: video. Max score: 1. Categories: video, vision. Language: en. Verified by llm-stats: no.