Monday, May 13, 2013

Week in review: annotating multimedia content

6th - 12th May

Discovering lots of things to write about semantically annotating multimedia content.  I decided there are three main ways to do this:
  • Technical / objective / statistical data: e.g. media type; shutter speed; framerate; duration; resolution; date created; number of times viewed at a particular source; number of times shared ...
  • Bibliographic: creators and contributors and their roles; methods/location of publication; methods/locations of creation ...
  • Content*: fictional characters; locations; camera movements; scene transitions; colours ...
These categories overlap somewhat, and when I get round to it I'll type my Venn diagram up.
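
A rough sketch of how the three categories might sit side by side on one record, in Python.  Everything here is made up for illustration: the class name, the field names and the example values are my assumptions, not taken from any of the vocabularies mentioned in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Annotations:
    """Hypothetical container: one media item, three kinds of annotation."""
    technical: dict = field(default_factory=dict)      # captured by hardware/software
    bibliographic: dict = field(default_factory=dict)  # who made it, where, how
    content: dict = field(default_factory=dict)        # what is actually depicted


def categories_used(item: Annotations) -> list:
    """Return the names of the categories that hold at least one annotation."""
    names = ("technical", "bibliographic", "content")
    return [name for name in names if getattr(item, name)]


# Invented example values, purely illustrative.
clip = Annotations(
    technical={"media_type": "video", "duration_s": 95, "framerate": 25},
    bibliographic={"creator": "A. Person", "role": "animator"},
    content={"characters": ["stick figure"], "colours": ["black", "white"]},
)
```

Keeping the three as separate dictionaries is only one option; since the categories overlap, a flat set of properties tagged with one or more categories might turn out to fit better.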

Technical is easy, and a lot of that is automatically captured by hardware or software used to produce and edit works.  It's also relatively easy to extract automatically.  Standards like MPEG-7 and MPEG-21 take care of formalising it, and Jane Hunter turned these standards into semantic ontologies in 2002.

Bibliographic can largely - but not entirely - be covered by vocabularies that have been around forever like Dublin Core, FOAF and various library-originated things.  What might be missing (or I just haven't found it yet) is a way of associating roles with the tasks involved in digital media production, since pieces are often a collaborative effort.   has some idea of participants and roles, but the purpose of  is digital rights management stuff, so it's more concerned with the distribution chain, I think, than granular production of content.  I haven't read much about it yet.

Content is more interesting, and potentially more useful for ordinary human beings.  Imagine querying IMDB for "that film where John Goodman arrests an animated talking moose on a US highway" instead of scouring John Goodman's filmography or googling for pictures of animated meese until you see the right one.  Annotating characters, objects and events, and stringing them onto a timeline is possible with OntoMedia.  It's very focussed around narratives, which is great, but doesn't link back to technical so much.  So if you did find the answer to that query, it wouldn't be able to serve up the timestamp of that particular scene.
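
To make the missing link concrete, here is a minimal sketch of content events that do carry timestamps, so a query can hand back the position of the scene.  This is not OntoMedia's model: `SceneEvent`, `find_scenes`, the field names and the placeholder timeline are all hypothetical, just plain Python to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneEvent:
    """Hypothetical content annotation tied to a time range in the media."""
    start_s: float            # technical side: where the scene starts (seconds)
    end_s: float              # technical side: where the scene ends (seconds)
    action: str               # content side: what happens
    participants: frozenset   # content side: who is involved
    location: str             # content side: where it happens


def find_scenes(events, action, participants):
    """Return (start, end) pairs for events matching the action and
    involving at least all of the given participants."""
    wanted = frozenset(participants)
    return [
        (e.start_s, e.end_s)
        for e in events
        if e.action == action and wanted <= e.participants
    ]


# Placeholder timeline with invented values.
timeline = [
    SceneEvent(10.0, 42.5, "drive", frozenset({"character A"}), "highway"),
    SceneEvent(42.5, 60.0, "arrest",
               frozenset({"character A", "character B"}), "highway"),
]

matches = find_scenes(timeline, "arrest", ["character B"])
```

With the placeholder data above, `matches` holds the single time range of the arrest scene, which is the bit a purely narrative annotation wouldn't be able to serve up.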

On top of what I've looked at already, I still have this list to (re)investigate: 
A thing I want to do is annotate some amateur content with OntoMedia and with ABC to see how they compare.  Maybe I'll do asdfmovie, because it has associated comics, and multiple people participating in production.  Then I'll do something live action as well, because I can't base all my research on non-sequitur lolrandom stick figure cartoons.

Now, back to work..

* I want a better name for this, since I'm referring to everything as 'content' anyway.  So some better way of saying 'content of content'.

