Summary: “Object and Action Recognition Assisted by Computational Linguistics”.
The aim of this project is to investigate how computer vision methods such as object and
action recognition may be assisted by computational linguistic models, such as WordNet.
The main challenge of object and action recognition is the scalability of methods from
dealing with a dozen categories (e.g. PASCAL VOC) to thousands of concepts (e.g.
ImageNet ILSVRC). This project is expected to contribute to the application of automated
visual content annotation and, more widely, to bridging the semantic gap between
computational approaches to vision and language.
Congratulations to Dr Amjad Altadmri for completing his PhD degree. Amjad received his PhD degree in the formal September Graduation Ceremony at Lincoln Cathedral.
Amjad Graduation Ceremony – September 2013
His PhD thesis is titled “Semantic Annotation of Domain-Independent Uncontrolled Videos, Incorporating Visual Similarity and Commonsense Knowledge Bases”. The work produced a framework for semantic video annotation. It also produced VisualNet, a semantic network for visual-related applications.
The photo shows Dr Amjad Altadmri (left) with his Director of Studies/Supervisor Dr Amr Ahmed (right).
Amjad has also participated, with Amr and other members of the DCAPI group, in various workshops, especially the V&L EPSRC Network workshops. They presented sessions and showed posters; see related blog posts:
Congratulations to Saddam Bekhet (PhD Researcher), who achieved the “Best Student Paper Award 2013” for his conference paper entitled “Video Matching Using DC-image and Local Features”, presented earlier at the “World Congress on Engineering 2013” in London.
Abstract: This paper presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without decompression. The relevant arguments and supporting evidence are discussed for developing video similarity techniques that work directly on compressed videos, without decompression, and especially utilising small-size images. Two experiments are carried out to support the above. The first compares the DC-image and the I-frame in terms of matching performance and the corresponding computational complexity. The second compares local features and global features in video matching, especially in the compressed domain and with small-size images. The results confirmed that the use of the DC-image, despite its highly reduced size, is promising, as it produces at least similar (if not better) matching precision compared to the full I-frame. Also, using SIFT as a local feature outperforms the precision of most of the standard global features. On the other hand, its computational complexity is relatively higher, but it is still within the real-time margin. There are also various optimisations that can be done to improve this computational complexity.
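To give a feel for how small a DC-image is, here is a minimal sketch in Python. It is not the paper's implementation: it assumes already-decoded greyscale pixels and averages each 8x8 block, relying on the fact that the DC coefficient of an 8x8 DCT block is proportional to the block mean. The function name `dc_image` and the synthetic frame are hypothetical; in the compressed domain the DC values would be read from the stream without decompression.

```python
def dc_image(frame, block=8):
    """Approximate a DC-image by averaging each block x block tile.

    `frame` is a list of rows of greyscale intensities (assumed input).
    Partial tiles at the right and bottom edges are ignored.
    """
    height, width = len(frame), len(frame[0])
    rows = []
    for top in range(0, height - block + 1, block):
        row = []
        for left in range(0, width - block + 1, block):
            # Sum all pixels in the tile, then divide by the tile area.
            total = sum(
                frame[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        rows.append(row)
    return rows


# Synthetic 16x24 frame: the left two thirds are dark, the right third is bright.
frame = [[10] * 16 + [200] * 8 for _ in range(16)]
small = dc_image(frame)
print(len(small), len(small[0]))  # 2 3
print(small[0])                   # [10.0, 10.0, 200.0]
```

Each 8x8 block collapses to a single value, so the result has 1/64 of the original pixel count, which is why feature extraction on it is comparatively cheap.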
(Click “Semantic Video Annotation with Knowledge”, https://amrahmed.blogs.lincoln.ac.uk/files/2013/03/Semantic-Video-Annotation-with-Knowledge.pdf , to download the PDF)
INTRODUCTION
The volume of video data is growing exponentially. This data needs to be annotated to facilitate search and retrieval, so that a video can be found quickly whenever it is needed.
Manual annotation, especially at such volume, is time-consuming and expensive. Hence, automated annotation systems are required.
AIM
Automated Semantic Annotation of wide-domain videos (i.e. no domain restrictions). This is an important step towards bridging the “Semantic Gap” in video understanding.
METHOD
1. Extract a “Video Signature” for each video.
2. Match signatures to find the most similar videos that already have annotations.
3. Analyse and process the obtained annotations, in consultation with common-sense knowledge bases.
4. Produce the suggested annotation.
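The four steps above can be sketched roughly as follows. This is an illustrative toy, not the framework itself: it assumes step 1 has already produced a numeric signature per video, uses cosine similarity for matching, and stands in for the common-sense knowledge base with a small hand-written dictionary. The names `cosine`, `suggest_annotation`, `library` and `related` are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_annotation(query_sig, library, related, k=2):
    # Step 2: rank annotated videos by signature similarity, keep the top k.
    ranked = sorted(
        library,
        key=lambda item: cosine(query_sig, item["signature"]),
        reverse=True,
    )
    neighbours = ranked[:k]

    # Step 3: pool the neighbours' annotations and add related concepts
    # from the toy knowledge base (an assumed stand-in for a real one).
    votes = Counter()
    for item in neighbours:
        for term in item["annotations"]:
            votes[term] += 1
            for extra in related.get(term, []):
                votes[extra] += 1

    # Step 4: suggest the terms supported by at least two votes.
    return sorted(term for term, count in votes.items() if count >= 2)


# Step 1 is assumed done: each library video already has a signature.
library = [
    {"signature": [0.9, 0.1, 0.0], "annotations": ["person", "running"]},
    {"signature": [0.8, 0.2, 0.1], "annotations": ["person", "jogging"]},
    {"signature": [0.0, 0.1, 0.9], "annotations": ["car", "road"]},
]
related = {"running": ["exercise"], "jogging": ["exercise"]}

result = suggest_annotation([1.0, 0.1, 0.0], library, related)
print(result)  # ['exercise', 'person']
```

In this toy run neither "running" nor "jogging" gets two votes on its own, but the shared related concept "exercise" does, which is the kind of generalisation a knowledge base can contribute.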
EVALUATION
• Two standard and challenging datasets were used: TRECVID BBC Rush and UCF.
• Black-box and white-box testing were carried out.
• Measures include: Precision, Confusion Matrix.
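For readers unfamiliar with the two measures listed above, the following minimal sketch shows how they can be computed from lists of true and predicted labels. The labels are made-up examples, not results from the evaluation, and the helper names `confusion_matrix` and `precision` are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def precision(matrix, label):
    """Fraction of items predicted as `label` that really are `label`."""
    true_pos = matrix[(label, label)]
    predicted_as_label = sum(
        count for (_, pred), count in matrix.items() if pred == label
    )
    return true_pos / predicted_as_label if predicted_as_label else 0.0


# Made-up labels for illustration only.
actual = ["run", "run", "walk", "walk", "jump"]
predicted = ["run", "walk", "walk", "walk", "run"]

matrix = confusion_matrix(actual, predicted)
print(matrix[("run", "walk")])   # 1
print(precision(matrix, "run"))  # 0.5
```

The confusion matrix shows which categories are mistaken for which, while precision summarises, per category, how trustworthy a predicted label is.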
CONCLUSION
• Developed an Automatic Semantic Video Annotation framework.
• Not restricted to videos from a specific domain.
• Utilising common-sense knowledge enhances scene understanding and improves semantic annotation.
Amjad Altadmri passed his PhD viva, subject to minor amendments, earlier today.
Thesis Title: “Semantic Video Annotation in Domain-Independent Videos Utilising Similarity and Commonsense Knowledgebases”
Thanks to the external examiner, Dr John Wood from the University of Essex, the internal examiner, Dr Bashir Al-Diri, and the viva chair, Dr Kun Guo.
Congratulations and well done.
All colleagues are invited to join Amjad in celebrating his achievement tomorrow (Thursday 28th Feb) at 12:00 noon, in our meeting room MC3108, with some drinks and light refreshments available.