Amy Guy

Raw Blog

Wednesday, November 20, 2013

OKFN Glasgow #2

I ventured to Glasgow for the second Open Knowledge Foundation meetup on Monday 18th. It was well attended, and there were six short talks:

Lorna Campbell from Cetis talked about Open Scotland. I understood this to be a collaboration between Cetis, the SQA, JISC and ALT Scotland, to do with the opening up of education, and influencing policy and practice in this area. Here's a blog.

Grianne Hamilton from JISC talked about Mozilla's Open Badges. You can use them to reward learning, skills and achievements in all sorts of areas, and any organisation can create and issue badge packs to people who have earned them. Recipients can then show them off anywhere they can put HTML.

Graeme Arnott talked about a collaboration between Glasgow Women's Library and Wikimedia, which resulted in the Scottish Women on Wikipedia event. This was a group of Scottish women getting together to edit Wikipedia articles about Scottish women, and there was very positive feedback. They have more events planned. Graeme also reminded us about Wikimania, which is taking place next August in London.

Jennifer Jones told us about the Digital Common Wealth project. She pointed out that with media-saturated global events like the Olympics, the official story is already decided before the event even starts. An alternative to relying on what is broadcast by the mainstream media is to turn the camera on the crowd, and get the 'real' version of what is going on. The Digital Common Wealth project will encourage citizen journalists to work together to craft the story of the Glasgow Commonwealth Games from their perspective. Jennifer also raised the point that although free tools like YouTube, AudioBoo and Twitter are great for spreading stories, the data is still held by third parties - what happens if they disappear? How should initiatives like this safely archive their stories, and keep them in context?

Pippa Gardner talked about Glasgow's Future Cities project, for which they have £24 million in development funding. It's about "people and data", but she was here to talk about data. There's the Data Innovation Engagement (which apparently needs a better acronym) and Glasgow's data portal, which has already launched. Not all of the data on there is 'properly' open, but it's more open than it was before. There's a maps portal coming soon. Follow @openglasgow to keep up to date. Someone asked how they can avoid inadvertently widening the digital divide by making all this data available - as it will only improve things for people who already have understanding and access. Pippa said there's a dedicated group in the Council working on widening digital participation, so they're involved.

Duncan Bain, an MPhil student at the University of Edinburgh, talked about Open Architecture. He says it's hard to define 'knowledge' and 'data' in architecture; architects create drawings/representations, not buildings. There are efforts towards opening certain aspects of this, like and the Open Architecture Network, but the culture of the architecture world, and where the money is, seems to be preventing things from going in the same direction as software development any time soon.

Here are livestreams of the talks by Jennifer Jones: one and two, and a Twitter timeline by Sheila Macneill.

Thursday, November 14, 2013


Prewired 3

Our third Prewired event went smoothly, with 20 young people (about 4 new) and 7 or so parents attending, plus 8 mentors. So lower signups than usual (a few cancellations due to school commitments), but we decided not to do a big publicity push and see how it ran with a smaller group. I didn't notice much difference, since they organise themselves into smaller groups anyway to work on different things. I think next time we'll try to reach our capacity of 40.

We had a big group working on a variety of Python projects (games, basics, algorithms, I'm not sure what else..), a small group doing front-end web, and quite a few doing amazing things with Scratch.

Every week I discover new things these super-talented young people are doing with their time, and it won't be long before many of them are spending a lot of time mentoring their peers as well as working on their own projects.

Nantas came by to talk about what he does with the University's Robotics lab, including the challenges of making humanoid robots play football, and the state-of-the-art, two-million-pound, full-sized humanoid robot that is moving to Edinburgh in the near future. Definitely stuff to get young people excited about learning to code.

We've been trying to encourage them to code between Prewired sessions, too, and about half of them said they had. I hope by the next time all of them have, and I'm really excited to see what they're capable of making in a few months' time!


Some of the young people attending are disadvantaged by not being able to bring their own laptop, or having only really old laptops which can't support modern browsers and therefore have trouble even executing the JavaScript they're writing (true story).

We'd love to be able to pay for a set of simple but up-to-date laptops that we could lend to the attendees who don't have their own during sessions.  This at least will put them on a level playing field with the others during the sessions, and I suspect that many of them have adequate desktop machines or family laptops at home.

Prewired runs on a budget of volunteer blood, sweat and tears, and zero pounds.  We're lucky enough to be able to use space in the University Informatics building for free, and there is no shortage of keen mentors and helpers willing to chip in their time (and in some cases cash for snacks).

So if you work for a company who might be able to support the purchase of resources for our young coders, or know someone who does, then please get in touch!

Sunday, November 03, 2013

Nanowrimo: It begins

I've committed to Nanowrimo for the seventh year.  I almost didn't.  It's distressing and frustrating and sucks at my self-confidence like nothing else.  It makes me feel like a failure in a way that nothing else can.

But it makes me write.

If I don't commit to it, I don't write a lot of fiction.  Maybe a burst every six months.

But every November for the last few years, I've written literally thousands of words.  I've brought vague, lingering ideas to life; I've fleshed out characters; I've explored worlds.

Every November for the last few years, I've bashed out incoherent paragraphs figuring I'll fit them in properly later. I've exhausted ideas that I now never want to hear of again.  I've killed my love for characters, and tired of worlds.

I've doubted my writing abilities, my imagination, my creativity, my storytelling.  I've convinced myself that I'm incapable of finishing anything.

The one year I hit 50,000 words? (50,299 to be precise).  I was maybe a chapter away from finishing the actual story.  The third quarter needed totally replacing and didn't really fit with the main story.  Four or five years later, I still haven't written that final chapter, even though I know what the outcomes are to be.  I haven't even typed it all up, let alone re-written part three.  I didn't fall out of love with the characters or the world, and I think about it a lot, and it breaks my heart.

Every November I've made a few new friends, and reconnected with old ones.  Bonding with someone over Nanowrimo is an experience that stands alone.  I've had one more conversation-starter than usual.  I've discovered some new cafes and new writing software.  My productivity has increased as a result of using The Work I'm Supposed To Be Doing as a distraction from writing.

Every year I tell myself I'm doing it to make myself write.  The 50,000 is irrelevant.  I just need to write some words.  More than none.  Then I'm a winner.  But not hitting 1,667 per day still feels like a gut-wrenching failure.  Finding out someone else is further on than me brings me down a notch.  Even with my inner-editor firmly silenced (she crawls into the cupboard of her own accord on the 1st of November these days) the inability to just sit down and churn out words right off the bat is crushing.

But it does make me write.

Writing fiction is my first love.  What I wanted to be when I grew up was "author".  It was a complicated word I knew when I was quite little.  Along with "aspidistra", but that's another story.

Imagine if I'd gone on to study it?  If I was writing because someone told me to write?  If I had to write to move forward in life?  I'd probably have burnt out well before now.

I guess it hurts so much because it means so much.

And that's why I have to get over myself and just get on with it.  If... when?... if I succeed, where success is writing a story I'm happy with, regardless of length, the boost will be indescribable.  I'll get a new lease on life.  I'll be sure I can do anything.

I'm going to the Edinburgh NanoBeans launch party tomorrow.  It's at 2pm in Forest Cafe.  I'm going to add loads of new people (People Who Understand) on various social networks to increase the chances of being asked how it's going.  Mostly I'll just have to shrug and say slowly, and feel guilty about that movie I watched or that extra batch of brownies I baked.  But maybe... just maybe there will be a time this year when I can say "it's going great! I'm ahead of target."

And just for some encouragement, here are some pictures from 2008:

Thanks Nano.  I need you.  Never leave.

Friday, November 01, 2013

Sky stole £50 from me

Here is my tale of trying to get it back.

£50 isn't a large amount of money to a massive organisation like Sky. But for me, it's two to three weeks' food, or a train ticket to visit my mother, or about half of my Christmas shopping (yeah alright, I'm not very extravagant with that sort of thing).
Following is an account of how a relatively small mistake on their part, which I thought was resolved well and quickly, then led to an agonising four month back-and-forth of:
a) Lies

b) Broken promises

c) Incompetencies

d) All of the above.

May: Attempted cancellation

I messaged through Help & Support to cancel my broadband and phone line, because I was going away a lot over the summer and it wasn't worth keeping it on. I was told it was sorted. I went to Australia for a month.

June: Apparently failed cancellation

I got back to Edinburgh and discovered I'd been billed. I turned on my router to see if I'd managed to mess up the cancellation myself, and saw no sign of broadband. So the services were cancelled, just not the billing.

18 July: Good customer service

I finally got round to contacting Sky about my bill; by this point I'd been billed for July as well.
I spoke to a couple of people on the phone who made me feel like it was my fault, then finally to the lovely Rachael, who dug deeper and discovered a human error had occurred as a result of (we think) the two ways of writing the first line of my address: 2F2 35 or 35/5. As a result, I had two account numbers... one had the services I guess, and one had the bills. It was very confusing, but once we found the two numbers Rachael properly cancelled my mysterious second account, and very kindly applied £50 credit to make up for the overbilling. I was told to contact them again in a few weeks to initiate the refund process.

16-26 Sept: Being blown off

I tried several times to contact customer services via help tickets to ask for the refund, but was either told (by a human, not an automated error message) that there was a 'system error' with my bill, or got something totally unhelpful (telling me something I didn't ask about) in response.

29 Sept: Actually, no

I'm told the £50 credit can only be spent on Sky services, not refunded directly to a bank account.

29 Sept: I damn well disagree!

I protested the unfairness of this, since I have moved somewhere with an existing broadband provider and actually probably wouldn't be going back to Sky after all this even if I had a choice...
I was told that actually they can refund it after all, and I should expect it in 48 hours. I pointed out that I'd heard this before. I was told not to worry, it'll be fine! I relaxed.
48 hours later... no refund.

Social media

Throughout September and October I started tweeting about the problems. I liked to do things like compare Sky's customer service to that of Virgin Mobile (who have done wonderful things for me from time to time). I hoped 'public shaming' might speed up a resolution. I got encouraging responses from the social media team at Sky, who actually seemed to care (which is their job, I suppose) but really I only ended up opening more tickets and talking to other advisors in the end.

21 Oct: A change of tune

Following another help ticket chase up, I'm told credit was incorrectly applied in the first place, and has been removed. I'm not exaggerating (just paraphrasing) when I say this was done "for reasons".

21 Oct: You what?!

Oh, hey guys, how about... no? I didn't just suffer through months of torment for you to try to tell me the only competent member of staff I have spoken to was actually incompetent after all!
So I had a very long live chat to William (I think; I've named him, because he didn't leave a follow-up note like he was supposed to) who "carefully reviewed my account" and agreed with the verdict that the credit was incorrectly applied. I protested. He "carefully reviewed my account" some more, and then said he could see that an explicit reason was left for the application of the credit to my account in the first place... so it should have been there after all! Yay.
He said it would be refunded to my bank account in 78 hours. I told him I'd heard this before. He said not to worry, it'll definitely be fine this time.
Guess what.

31 October: It wasn't.

So far, no refund, no re-application of credit to my bill, and not even an update to my help ticket about the conversation. It was like it never happened. I stupidly didn't think to copy the chat as evidence, although I assume they have a transcript of it somewhere.

1 November: Hope

I DMed the social media team a bit, and scheduled a time to live chat with one of them, directly. While I waited, I typed out a timeline of everything that happened so far so I could paste it straight to them.
Three quarters of an hour later, I have hope once more following a chat with the most human member of Sky customer services I have spoken to so far.
He found actual reasons for things that had been done for "reasons". For example, my initial refund was never successfully issued because goodwill credit must be issued as a cheque, not a bank account transfer, so Finance just rejected it. A silly rule, but that's the way of it. (In my case the response was just to not issue it though, so Finance can't even follow their own silly rules).
I'm still not entirely sure why the credit was totally removed though, or why on my bill its removal shows up as a charge for Sky TV.
He treated me like a person by not making vague promises, or holding back particularities of how the organisation works. He told me he believed I should be receiving a refund based on what he knew so far, but he'd have to talk to his manager. He talked to his manager, who agreed. But he didn't then just tell me everything would be fine. He told me it still might get rejected by Finance (for "reasons", I presume).
What he is doing is speaking directly to someone in Finance to get a cheque sent out. He's manually changing my address to make sure the cheque goes to the right place, and he's going to get in touch with me again on Wednesday night to let me know what the progress has been.
I asked him what the next step is if Finance reject the refund request, and he implied threats of violence. (NOTE to Sky managers who might read this: I don't believe he meant he would really commit violence on the Finance team. He was doing his job well and using humour to relieve me whilst promising he would make an effort to follow up).
He also gave me permission to verbally abuse him if he doesn't get in touch on Wednesday night. I appreciate this sentiment, though it's unlikely I'll get into capslock territory with this guy any time soon. I would tweet a gentle reminder of course.

6 November: The Wednesday follow-up

I got a Twitter DM with a link to a live chat... After just under an hour of waiting in a 'queue' for the live chat I had to leave, and DMed @SkyHelpTeam back to ask if they'd let me know when there was someone there, so I wasn't wasting my time refreshing a page.  They responded and sent the details to my MySky help tickets instead.  And the result?

A cheque is in the post!

Please allow 28 days for it to arrive.

I sure hope that's true.  And that it's coming to the right address.  I might send a letter to my old flat, just in case.  So I guess I'll update in 28 days whether or not it actually arrived.  I'm hopeful, but they've promised me that money is on its way in x amount of time before...


If your problems aren't fixed immediately, pester the social media team (@SkyHelpTeam).  Get everything in writing, record every conversation, keep track of dates, names of customer service people, and what you were promised.  Don't give up.  For every semi-competent and sympathetic customer service person, there are four or five lazy/useless/uncaring or possibly even malicious ones.  Just keep trying, and you'll get through eventually...

Recommendations for Sky

Following my unfortunately extensive experience with Sky help ticketing, I'd like to make a few suggestions for its improvement.
  1. Tickets should be marked as resolved by the customer. I have so many tickets that I don't consider to be resolved, sitting in my 'resolved' tickets column.
  2. I should be able to reply to tickets. I post a request, I get a response that is marked as resolved that I don't agree with. I then have to follow up by opening a new ticket, which inevitably goes to a different person, and I end up going around in circles.
  3. If I'm taking the time to type out messages to you, it probably means I don't want to talk to you on the phone. It doesn't matter why. Take the time to write messages back. (Related: telling me it's free to call customer service on a Sky line is really unhelpful when I'm trying to contact you about a recently cancelled Sky line. The fact that I never physically had a landline phone, line or no, is irrelevant here).

Thursday, October 31, 2013

Nanowrimo: "Why do I do this to myself?"

'Twas the night before Nano.
A world swims in my head.
I'm way scared of planning.
I'm alone in my bed.

Don't know if I'll sleep.
Oh, what have I done?
Maybe I'll dream all fifty thousand tonight.
Don't panic; it'll be fun!

I'm not aiming to 'win' though.
Just aiming to write.
Any words better than no words.
I'm sure my PhD will be alright.

Friday, October 18, 2013

The Launch of Prewired

Several weeks of debating and planning following Young Rewired State finally came to fruition on the 16th of October, with our first Prewired event.

Thirty-eight young people arrived between 9:30 and 10 on that Wednesday morning (it was half term week in Scotland, so we weren't pulling them out of school), grabbed some kindly donated Google swag, made name badges with stickers and felt-tipped pens, and sat down for two and a half hours of lightly guided learning.

They were between the ages of three and eighteen, although the three to six year olds were more there to be tagging along with older siblings or University staff. It's obviously impossible to divide attendees up by age and decide what to work with them on, as older definitely does not mean more experienced. We had decided on no lower bound for the age limit, and no lower bound for experience either, figuring that the only real requirement is enthusiasm about programming. There was a huge mix of interests and abilities, and we let them decide for themselves which topics would be worth listening to.

We also had about fifteen students, University staff or industry professionals along as mentors.

After a few minutes of welcomes, where most of the room were willing to introduce themselves and tell us what they wanted to learn ("Python", "Scratch" and "more about programming in general" were popular ones) we kicked off with three five minute introductions: to HTML and CSS (beginner), to HTML5 Geolocation (intermediate) and to Python's Natural Language Toolkit (advanced). They then had the chance to spend 40 minutes in a hands-on session for whichever of these they chose. The groups were very evenly spread, and despite a few hiccups with Python installations on Windows and Chrome not playing nice with geolocation (worked through thanks largely to the mentors) most people got some code up and running and appropriately hacked about with by the end.

We took a break for juice, crisps, chocolate and fruit, plus a bit of hardware tinkering. We'd borrowed a Nodecopter, but hadn't managed to get it charged in time so it wasn't in the air, but there were still plenty of people interested in looking at the code to control it. We also had a demo of a robot arm, which could be controlled by an Android app connected to a Python server, which had been written over the summer by one of our mentors.

Next up were three more lightning talks: introduction to Scratch (beginner), doing cool things with Redstone Circuits in Minecraft (intermediate) and introduction to PyGame (intermediate-advanced). The following hands-on session for Scratch was under-attended, possibly ousted by the allure of Minecraft, but the PyGame session had over a third of the group and made some great progress, which was awesome.

We finished a little late, but still managed to have time for a quick demo of a football playing robot from the nearby robotics lab, and a few attendees who took their time dragging themselves away from their screens.

I'm told that overall it was a success. I was concerned because I was generally called upon when something was going wrong, so my perspective was weighted towards the negative. But it wasn't too chaotic, none of the attendees played up, and as far as we could tell they were doing something in some way productive at all times.

A lot of them had had little to no programming experience before that morning, and I really hope they were able to take away something positive and, most importantly, feel encouraged to try things out by themselves at home. Plenty, too, had enough experience that they were calling out to correct the speakers, and helping their peers to get things working. It's a huge challenge to find enough activities to engage so many different levels of experience and interest, and I don't think we did a bad job.

Our next Prewired event will be on the 30th of October, and we're running them bi-weekly on Wednesday evenings from now on. They will be henceforth less structured. Our primary aim is to help young people to realise that with programming (and related areas) they can create anything, express themselves, and change the world. We don't wish to enforce a curriculum, but encourage them to explore areas they are interested in, learn how to teach themselves and figure out how to make what they want, and most of all to persuade them not to be afraid to experiment - to hack - and to just keep trying if it doesn't work first time. To get them excited before they become jaded and before this society's stereotypes have a chance to impact on them.

You can find out more about Prewired at, and join the mailing list there too.

Photos and feedback

Here are some of the photos from the day:

If you took some that you'd like us to add, then please send them to!

Similarly, send any feedback you have about the event to us that way, as well.


I'll update this post (as well as the website) with resources from the speakers and mentors as I get hold of them.

Beginning HTML and CSS:

HTML5 Geolocation:

Building a chatbot with Python's Natural Language Toolkit:

Intro to Scratch:

Minecraft Redstone Circuits:

Intro to PyGame:
  • Coming soon...

Monday, October 14, 2013

I touched my toes!

I touched my toes in yoga today.  It happened in the heat of power yoga, and I didn't think I'd be able to do it cold.  But I can!  That's my goal for the end of the year met then..

(This may sound trivial, but I have short hamstrings, and anything that involves bending in the middle and straightening my legs at the same time I find extremely difficult.  This is the main thing I'm aiming to overcome with yoga).

Another first from today was binding without help in a spinal twist.  I managed to do this again at home an hour later, too.

And I've noticed that going into Chaturanga between sets of postures has become a reprieve, a chance to catch my breath and rest for a second.  Chaturanga is essentially holding yourself in the middle of a pressup, and when I started this class that was not something I could do, or ever thought it might be a good idea to do; with the fast pace of the class, it was easy to collapse down onto my chest and skip over it.  But over the past few weeks as I've got used to how the class progresses I've slipped into doing the Chaturanga properly - or as properly as I can without having time to stop and think about it - and not having trouble at all.  I just tested that theory at home, and held myself in it for a good ten seconds.

I haven't seen such fast progress in any of the other yoga I've done.  This class is exciting me, and filling me with hope.

It's taking a toll on my wrists though.  By the end of the semester, they'll be strong :)

Monday, October 07, 2013

Yoga progress

I haven't blogged about yoga yet, but now seems like as good a time as any to start.

I started yoga-ing towards the end of 2012, with on-and-off classes at the Commonwealth Pool, then joined two beginner classes (with very different teaching styles) in January that ran for a semester as part of Edinburgh Council's Adult Education Programme.  I was hooked, and since the start of this semester I've been doing four classes a week:

Monday is Vinyasa 'power' yoga, one of the classes held by the University's Yoga Society.  It's fast, sweaty and intense, and I'd never have been able to handle it - or enjoy it - as a complete beginner.

Tuesday is a really relaxed beginner class running this term through the University Chaplaincy, mainly for relaxation. Great for the final hour of the 24 hour recovery from Monday's class.

Wednesday and Thursday are post-beginner Adult Education Programme classes, in Cameron House Centre and Nelson Hall respectively, with one of the teachers from my first semester of regular classes before the summer.

But what I really wanted to say, is that today I got myself into a full backbend unassisted (the last two weeks I've had help) and got substantially closer to reaching my toes with a straight back than I ever have before.

Go me.

(I definitely have shorter hamstrings than is normal, and my main aim with yoga is improving on that).

Monday, September 30, 2013

Angel of Death (J. Robert King)

I have a thing where I can't not finish reading something.  There's a very short list of books I never finished, and they all date from when I was about 8 to 13, and are because I was too young to follow them or too young to bear them, and just haven't got around to picking them up again.  They haunt my subconscious.

These days I feel I have to get at least half way to have given it enough of a chance, and once I'm past half way I feel I might as well finish it.

By the time I had wrenched my way through the first half of Angel of Death, I had started to come round to it.  By the end I guess I'd enjoyed it in some ways.

What I struggled with through the first half was the erratic jumping between persons and, worst of all, tenses.  You experience one character's perspective in first person present tense (something I dislike anyway), plus second person directed at another character.  And sometimes past tense.  Other perspectives were usually third person past tense.  I guess I got used to it, but if someone had told me before I started that it was written in this way, I probably wouldn't have picked it up.

I'm starting to figure out that I prefer character-driven narratives.  This is the pattern with things I've enjoyed lately, anyway.  The premise of Angel of Death was kinda interesting.  But the characters were utterly flat and often behaved unbelievably or in a very contrived manner, given how they'd been set up.  It was all tell and no show.  THIS IS A BAD EVENT IN HER TROUBLED PAST, OH BOY, NOW YOU KNOW HER MOTIVATIONS.  Totally wasn't enough.

The twists and turns throughout are, I suppose, well done.  The reader is convinced of the state of the world and, just as you're absolutely certain that that's the way it is, you're being convinced of the opposite.  This happens not quite enough to feel like an indecisive cop-out, but isn't far off.

There's gory horror, but it feels appropriate and not over-done.  Kudos.

If there were deeper levels of meaning or metaphor intended, which I suspect they might have been, I missed them.

Conclusion: meh, don't bother.  But if you've got nothing else to read, you could do worse.  If you want my copy you can have it, get in touch.

Sunday, August 11, 2013

Young Rewired State in Edinburgh #yrs2013

Young Rewired State is a week-long hack event for under 19s.  There are centres all over the UK, and the week finishes with a giant sleepover in the Custard Factory in Birmingham, presentations and prizes.

I was helping out with running the Edinburgh centre this year, between the 5th and 11th of August.  We had 15 young people taking part, and a few parents popping in and out as well.  Not to mention several fantastic mentors.

Every day we gathered in one of the University of Edinburgh Informatics computer labs.  On the first day we did some brainstorming, introduced the young people to Open Data, and they sorted themselves into teams.

We had a diverse range of projects by the end of the week.

The Weatherproof app was written in Scala with a Web frontend, and as well as telling you the weather forecast, gives you practical advice on what to wear and what to take with you.

Stuff Index was a Python Web app that lets people photograph and upload stuff they've left out on the street that they want to get rid of, so anyone browsing the site can opt to take it away if they fancy it.  Helping to keep stuff out of landfill, and without the dreaded social interactions that come with Freegle.

Tag is a game by a one-man team, with a Python game server and a JavaScript front end that lets you chase your friends around the real world, and automatically tags them when you're in range.
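As a rough illustration of how a game server might decide that someone is "in range", here is a minimal Python sketch. It is not the Tag team's code: the function names, the player dictionary layout and the 10 metre tag radius are all assumptions made up for the example. It uses the haversine formula to estimate the distance between two latitude/longitude positions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in metres (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def players_in_range(chaser, others, tag_radius_m=10.0):
    """Return the names of players within tag_radius_m of the chaser.

    Each player is a dict with hypothetical keys 'name', 'lat' and 'lon'.
    """
    return [
        p["name"]
        for p in others
        if distance_m(chaser["lat"], chaser["lon"], p["lat"], p["lon"]) <= tag_radius_m
    ]


if __name__ == "__main__":
    chaser = {"name": "it", "lat": 55.9445, "lon": -3.1870}
    others = [
        {"name": "near", "lat": 55.94455, "lon": -3.1870},  # a few metres north
        {"name": "far", "lat": 55.9455, "lon": -3.1870},    # over 100 m north
    ]
    print(players_in_range(chaser, others))
```

In a real game the positions would come from each phone's browser, and GPS accuracy would probably call for a more forgiving radius than the one guessed here.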

PokeGame is a real-world Pokemon simulator that lets you roam IRL and capture virtual Pokemon.

Great stuff!

On Friday we crammed into a coach along with the participants from Aberdeen, Dundee and Glasgow, and set off on a seven hour road trip to Birmingham for the finale.

The Edinburgh teams didn't win anything, but the presentations were fantastic and everyone had an amazing time.  The young people made new friends, learnt tons of new stuff, and hopefully remain enthused about coding.

Next year we're going to do more to walk through the creation process of some example apps to get them started off, and maybe do a better job of introducing Open Data and the possibilities it holds.

We're also thinking about starting a regular under 19s code club in Edinburgh - weekly or bi-weekly - so stay tuned for more info about that.  (And if you want to help or participate, get in touch!)

Sunday, August 04, 2013

Weeks in Review: Thesis proposal

29th July - 4th August

Discussed my ideas and work so far with Dave Robertson, my second supervisor, in two meetings this week.

Here's a summary of some of my thoughts.

5th - 11th August

I worked on my thesis proposal.  But it was also Young Rewired State week!

12th - 18th August

Finally (belatedly) handed in my thesis proposal.  (My review is scheduled for the 22nd August).

Friday, August 02, 2013

Vague thoughts about content creators and the Semantic Web

I had two meetings with Dave Robertson, my second supervisor, about what on earth I'm doing, and here is a vague summary of my thoughts afterwards.

I came to the realisation between meetings that I need to scrap the term Amateur Creative Digital Content, because amateur doesn't really apply by its true definition and creative is too subjective anyway.

Focus on content creators, not content (so previous point doesn't matter so much anyway; maybe just need to look at existing ways people are describing types of users to make it clear who I'm concentrating on).

In terms of emphasis of the thesis, I need to make a choice between taking a cognitive science/sociology perspective and a techie/engineering perspective (I choose tech because that's where I'm most comfortable, but the sociology side of things is still important).

(Therefore) I need to think concretely now about technology architecture.

Not to get too hung up on the Semantic Web; the technologies are a vehicle for testing theories, rather than an end in themselves (though I still think facilitating a big linked data set of this sort of data is useful in the long run for research and practical applications, I didn't labour that point).

Social machines, and how Dave's process modelling language fits in, which I think I get in theory but not practice (I'd probably have to look at a working application and code to understand really). Some of the principles may be useful further down the line, but probably not the language itself or anything.

Technology-wise, I'm not thinking about anything novel or new, but more new ways for how various Web and SW technologies are combined and applied to this domain. (?)

So maybe the novelty is in marking up various things about content creators and using this to infer information about the processes they're involved in (or want to be involved in) in order to then facilitate these processes, without (necessarily) ever explicitly representing these processes (because from the content creators' perspective, they're certainly not thinking in terms of formal representations of processes, and in many cases won't know what they're trying to make until it's done, for example).

How to represent the inferences made might be novel and exciting, but I don't know.

Hmm, I still don't think I've figured out how to evaluate .. anything. Beyond comparing activities of users with magical-new-system vs without magical-new-system. And maybe, going back to the this-big-dataset-is-useful idea, by finding questions we can now ask about these kinds of communities that we couldn't before because they were so fragmented.

Sunday, July 28, 2013

Week(s) in review: #SSSW2013, figuring stuff out and annotating YouTube

8th - 14th July

Semantic Web Summer School, much heat, much fun, much learning... Here's an index of my posts.

15th - 21st July

Friends visited.  Progress included writing notes to myself to figure out just what my PhD outcomes really are, and why.  Came up with:

1. Recommending how to usefully describe diverse amateur creative digital content (ACDC) using an ontology.
    a) What are the parts of ACDC that need to be represented?  Identify and categorise properties. How do these differentiate it from other similar content?
    b) What existing ontologies can be used to do this, and how do they need to be extended?
2. Building an initial set of linked data about ACDC, and providing means for its growth and use.
    a) Manual annotation of ACDC, and refinement (to test ontology).
    b) Tools for automatic annotation of the parts of ACDC that it is possible to automatically annotate.
    c) Tools for manual annotation by the community of content creators and consumers for the parts of ACDC that cannot be automatically annotated.
    d) Tools to expose the linked data for use by third-party applications.

3. Create and test an example service which uses the linked data to benefit content creators and/or consumers.
     eg. Unobtrusive recommendations for collaborative partners (most likely); content recommendation; content consumption analysis (like tracking viral content); community building / knowledge sharing in this domain; ... .

22nd - 28th July

Brainstormed with Ewan about stage 3 (above), and came up with the idea of an interface that allows content creators to allocate varying degrees of credit for roles played by different people when collaborating on a project.  This would serve to gather collaborative bibliographic data, learn things about how different segments of the community allocate credit, and provide a potentially useful tool for content creators.  There is future value too: if we can learn enough to estimate role inputs from different people, it could be used for things like automatic revenue sharing.

Then spent the rest of the week in London, frolicking amongst the YouTubers (including attending a meeting at Google about secret YouTube-y stuff), and annotated some ACDC.  Write-up coming soon.

Thursday, July 11, 2013

#SSSW2013: Social semantics and serendipity

We started work on the serendipity project before breakfast today, although I didn't make it down as early as some of my teammates.

To start the day, Fabio Ciravegna talked about some really exciting practical applications of monitoring and analysing social media streams.  It's particularly interesting during emergencies, or large events where problems might occur.  The people on the ground make the perfect sensors if you can work out the differences between people who are saying something useful and people who aren't; people who are really there, and people who are speculating or asking about the situation.  A main problem has been that people tweet crap.  They were trying to monitor a house fire, but so many people were tweeting lyrics from Adele's various singles at the time, which all apparently contain references to fire, that it was almost impossible.

They also put (or tapped into existing) sensors in people's cars to monitor driving patterns with the aim of more fairly charging for car insurance.  I told my Mum about this the other day, and she was pretty alarmed by the idea.  Which made me wonder how they'll get mass adoption, if it's going to go anywhere.

Fabio did have some interesting things to say about using all this data ethically though, and never working for someone who is going to take that away from you.  But in case the 'bad guys' do find out about all this data you have about people, keep a magnet handy.

My notes are here.

This was followed by a hands-on session where we got to mess with a mini version of the Twitter topic monitoring system that Fabio's team use at large events, to try to answer questions about the Tour de France only by manipulating the incoming social media streams and following only links which came through that.

Spanish omelette sandwiches were an amazing outdoor leisurely lunch.  We headed to the pool down the road and chilled out there for a couple of hours.  Us tough British folk found the water pleasantly tepid, whilst all those wimpy Europeans and Latin Americans shivered on the grass.  They'd made such a fuss in advance about how cold the pool was going to be.

We regrouped that afternoon to work on Project Cusack, creating a slide deck of pictures from Serendipity.  I don't like slides with too much to read on, so I enforced this.  The imagery from the movie will be lost on most people, but we have at least managed to choose pictures of John Cusack with appropriate expressions for each part of the presentation.  We worked outside in the forest, because Oscar's 3G was faster than the residence wifi.

We also brainstormed for the required short film, which we only just discovered doesn't have to be about our project.

We returned to the residence to find everyone eating ham and cheese, and attempted to get some shots for our film, but other people were unwilling to participate.

That evening we ate tasty vegetable soup, weird (in a bad way) pasta in a creamy onion sauce, and chocolatey ice cream cake.  The tutors spontaneously organised a game where students had to arrange the tutors by age, which was funny.  Someone suggested the tutors ought to play it with the students.  Obviously there were too many students, but they elected to find the youngest student, and that turned out to be me.

[Notes] Fabio Ciravegna at #SSSW2013

Make a model of what is happening.

WeSenseIt - citizen water observations.

River belongs to citizens, not authorities.
Physical sensors (hard layer) are expensive and brittle.
So use people instead (soft layer, social).

Give people small sensors.  Phones.
Then you just need software for information management.

  • capture.
  • integrate and correlate data.
  • share.

Can't rely on phones.
Old people in Doncaster.

Give them easy sensors instead.

  • camera
  • humidity
  • position GPS
  • water depth, velocity
  • rainfall via accelerometer
  • cloud coverage via luminosity

Costs about EUR 80.

Open Source & hackable.

Not expected to substitute professional sensors, but a way to crowdsource information you would never get.

In Delft

Give people flood preparation advice and record who ticks things off, to build a picture of who/how/when preparations take place.

The Floow Ltd

"Commercialises data solution for telematic insurance."

World divided 10x10m squares, sense things everywhere.
Traffic risks.

Sensors tell you people are going somewhere, not why.
That's what social media can tell you.

Monitoring development of a house fire via Twitter.
Seeing events through the eyes of the community.

Social streams:

  • High volume
  • Duplicated, incomplete, imprecise, incorrect
  • Time sensitive / short term
  • Informal
  • Only 140 characters
  • Spam

Large music festival.  Monitor geolocated messages, trends, topics and relations.

Most 'critical' events were management issues.
Developing system to warn you automatically about things to pay attention to.

Look/listen for event within 72 hours.  10 minutes to find out what it was.
- Simulation of station bombing.
Minute by minute description of event.
1.5 billion messages.

  • Linguistic issues
    • Alternative language
    • Negatives
    • Conditional statements
    • Hope/prayer statements
    • Irony/sarcasm
    • Ambiguity
    • Unreliable capitalisation
    • Data sparsity

Four things when monitoring:

  • What
    • Identify, classify, cluster
      • Events and sub-events
      • Involved entities
  • Who
    • Human or not?
    • Bots can be benign, but many are a serious risk.
    • Bots that pretend to be humans.
  • When
  • Where

Big problem - people tweet crap!
People don't realise when people nearby are in danger.

Deception on social media

False crowdsourcing political support on social networks.
Smear campaigns using bots.
Bots to foster / prevent social unrest.

Identifying bots

23 behavioural features.
Feature set is open.
Recognise 90% of bots - more than humans can do.

Only a very small number of tweets are geolocated, so geolocation is useless.
Have to use the text.

Timestamp is not necessarily correct.

Issues in events

No infrastructure (eg. at music festivals).
Phone signal issues, phone charging issues.

Most tweets from outside event.


Need to convince citizens that authorities are not spying on them.
Need to convince authorities that citizens are not all criminals.

Privacy and legality issues.

Creating a company on this research would be unethical.
Need to pass the right message.  Full disclosure.  Non-intrusive use of tweet content.

What happens when authorities demand this technology for privacy-invading stuff.

Have to be careful with what you publish.
Always assume the bad guys have thought of what you thought of.
Always be in a situation where you can destroy your data at short notice.
Big legal barrage behind them.  Know what they are/aren't allowed, know what they do/don't have to do.
Start leading a blameless life.

Wednesday, July 10, 2013

#SSSW2013: Practical semantics and human nature

Harith Alani talked about using semantics to solve problems around evaluating the success of social media use in business.  The SIOC ontology is widely used to describe online community information.  It's not as simple as measuring someone's engagement with a brand's online presence - people are 'likeaholics' on Facebook, so you have to look at someone's whole behaviour profile to judge whether their like means anything or not.  It's no good just aggregating your data and spewing out numbers - you have to browse the data and try to understand where it came from.

He mentioned how little work has been done in classifying community types.  Most of the work that has been done seems to be with social networks internal to an organisation.  A bottom-up approach to community analysis can handle emergent behaviours and cope with role changes over time.  Looking at behaviour categories and roles can help an organisation to decide who to concentrate on supporting and how in order to sustain the community.  The results they have seen so far suggest that a stable mix of the different types of behaviours is needed to increase activities in forums - but they don't know what causes what.  They're reaching a point where they can use their behaviour analysis to guess what's going to happen to a community: how long it will last, how fast it will grow, how many replies a certain type of post is likely to get, etc.

Next they want to be able to classify community types, and be able to look at activities within a community over a period of time and automatically discover what kind of community it is; it might be something different than what it was set up for.

They created an alternative Maslow's Hierarchy of Needs to correspond with activities seen on forums, and found that most people are happy to stay at the lower levels of the hierarchy.  For example, join a community, lurk for a bit, ask one question and leave.  Not everyone wants or needs to be a power user.

Papers are being written that find patterns in individual datasets for a particular community in a particular context.  Harith and his team are getting tired of this; they want to generalise across communities.  So they took seven datasets and looked at how the analysis features differed as well as comparing the results across community types, randomness (vs. topicality) of datasets, and compared similar experiments.

Upcoming work includes the Reel Lives project, in which UoE is involved.  They're taking media fragments - photos, videos, audio clips, text recorded as audio - and creating automated compilations to tell a story.

Another is social methods to change energy consumption behaviour.  LiSC in Lincoln did something in this area back in the day... an app that posted on your Facebook feed that you were listening to an embarrassing song if you left your lights on.

Notes from Harith's talk are here.

From Tommaso Di Noia's talk, I learnt that recommender systems have a lot of maths behind them, especially for evaluating things, and reinforced something I already knew: I don't maths good enough to be taken seriously by most of the Informatics world.  I think I understand the principles behind the maths, but when something is described in just maths, I have no idea what it relates to.  I'll work on this.

Real world recommender systems use a variety of approaches, including collaborative (based on similar users' profiles); knowledge-based (domain knowledge, no user history); item-based (similarities between items); content-based (combination of item descriptions and profile of user interests).  Linked Open Data is used to mitigate a lack of information about entities, and helps with recommending across multiple domains.  You do have to filter the LD you use before feeding it to your recommender system though, to avoid noise.  Notes here.

Tommaso's talk was followed up by a hands-on session, where we got to poke about with some of the tools he mentioned, including FRED (transforms natural language to RDF/OWL); Tipalo (gets entity types from natural language text); and using DBpedia to feed a recommender system.

Then we worked on our mini-projects for the afternoon.  We made some progress towards breaking down the concept of serendipity and working out what properties we might need to represent as linked data, and how we could observe a user and work out if/when/how they were having serendipitous experiences without intruding too much.

In the evening we took a coach to 'nearby' historical town Segovia.  Apparently an extremely motion-sickness-inducing two and a half hour coach journey around twisty mountain paths is 'nearby'.  Fortunately I was distracted from this horrible journey by a conversation with Lynda Hardman, which I wish I had recorded.  Lynda challenged various aspects of my PhD until I could explain/justify them reasonably, including:

  • Why digital creatives? (I'm used to that one now).
  • What is the outcome?
  • Why Semantic Web for this?

She also recommended a number of resources, including theses of her recent former students to help me with a structure for my own, and advice on maintaining a healthy balance between thinking and doing.

Plus she used to live in Edinburgh, more or less across the road from where I live now.  Cool.  Thanks Lynda!  You haven't heard the last of me :)


Once we got to Segovia, we had a guided tour of the ancient Roman architecture, interesting building façades and local legends.  It was a very good tour, but too hot to really focus.  Then they took us to a restaurant for a local speciality.  I was all set to write a whole individual blog post surveying the barbaric nature of human beings, but I didn't do it straight away and now the passion has faded slightly, so I'll leave it at a paragraph.  Some people watched the local 'ceremony' out of morbid curiosity I imagine, but it was the fact that so many people took so much pleasure in the idea of violently hacking up bodies of three-week-old piglets that really bothered me.  Fortunately the surging standing crowd allowed me (and only one other) to inconspicuously sit it out.  The veggie option was tasty, but it was difficult to really enjoy the rest of the evening whilst wondering vaguely about the states of minds of most of the people I was sharing a table with.

[Notes] Tommaso Di Noia at #SSSW2013

Tools and Techniques

Recommender systems

Input: Set of users + set of items + rating matrix.
Problem - given user, predict rating for an item.

In real world, recommendation matrix data is sparse.

Can use hybrid approaches.

Collaborative RS:

  • Like Amazon.
  • Based on other users with similar profiles.
  • Experimentally better than content-based, but you don't always have many users.

Knowledge-based RS:

  • No/little user history.
  • Based on domain knowledge.

User-based collaborative recommendation:

  • Pearson's correlation coefficient - baseline.
  • Imagine millions of users - computing similarities takes a lot of time.
  • So ..
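To make the user-based baseline concrete for myself, here is a minimal sketch in Python.  The ratings, user names and the `predict` helper are all made up for illustration, and it assumes the means are taken over co-rated items only, which is one common variant rather than necessarily what Tommaso showed.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}; unrated items are simply absent.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4, "d": 4},
    "bob": {"a": 3, "b": 1, "c": 2, "d": 3, "e": 3},
    "carol": {"a": 4, "b": 3, "c": 4, "d": 3, "e": 5},
}


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def pearson(u, v, ratings):
    """Pearson correlation between users u and v over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0  # not enough overlap to say anything
    mean_u = mean(ratings[u][i] for i in common)
    mean_v = mean(ratings[v][i] for i in common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * sqrt(
        sum((ratings[v][i] - mean_v) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(user, item, ratings):
    """Predict a rating as the user's mean plus a similarity-weighted average
    of how far positively correlated users deviate from their own means."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(user, other, ratings)
        if sim <= 0:
            continue  # ignore uncorrelated or negatively correlated users
        num += sim * (other_ratings[item] - mean(other_ratings.values()))
        den += sim
    return base + num / den if den else base


print(pearson("alice", "bob", ratings))
print(predict("alice", "e", ratings))
```

Every prediction loops over all other users, which is exactly the scaling problem with millions of users.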

Item-based collaborative recommendation:

  • Focus on items not users.
  • Compute similarity between each pair of items.
  • Don't have to compute similarity between items that don't have overlapping ratings.
  • Cosine similarity / adjusted cosine similarity (taking into account average rating related to a user to eliminate some bias).
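A rough sketch of adjusted cosine similarity between two items, again in Python with invented data.  The assumption here is that "adjusted" means subtracting each user's own average rating before taking the cosine, and that only users who rated both items contribute.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4, "d": 4},
    "bob": {"a": 3, "b": 1, "c": 2, "d": 3, "e": 3},
    "carol": {"a": 4, "b": 3, "c": 4, "d": 3, "e": 5},
    "dave": {"a": 2, "f": 4},
}


def adjusted_cosine(i, j, ratings):
    """Cosine similarity between items i and j, after subtracting each
    user's mean rating to remove some of the per-user rating bias."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i not in user_ratings or j not in user_ratings:
            continue  # only users who rated both items count
        user_mean = sum(user_ratings.values()) / len(user_ratings)
        di = user_ratings[i] - user_mean
        dj = user_ratings[j] - user_mean
        num += di * dj
        den_i += di * di
        den_j += dj * dj
    den = sqrt(den_i) * sqrt(den_j)
    return num / den if den else 0.0  # no overlap (or no variation) gives 0


# Item pairs can be computed offline, in a batch, and stored for later lookups.
items = sorted({item for r in ratings.values() for item in r})
similarity = {
    (i, j): adjusted_cosine(i, j, ratings) for i in items for j in items if i < j
}
print(similarity[("a", "b")])
```

Pairs like ("e", "f"), which no single user has rated together, come out as 0 without any real work, which is the saving mentioned above.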

Content-based RS:

  • Based on description of item 
  • and profile of user interests.

  • Items are described in terms of attributes/features.
  • Finite set of values associated with features.
  • Item representation is a vector.
  • Don't necessarily have complete descriptions of items - just have a 0 in your vector.

  • Similarity between items: 
    • Jaccard similarity.
    • Cosine similarity and TF-IDF (term frequency - inverse document frequency).
    • Batch compute similarities offline, then use similarities to compute ratings on the fly based on user profile.

  • Predict a rating only for the N nearest neighbours of items in the user profile, that are not already in the user profile.
  • An item is worth rating if more than x of its N nearest neighbours are within the user profile.
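Here is how I understand the content-based steps, sketched in Python.  The films, features and the `worth_rating` helper are hypothetical; items are represented as sets of features (equivalent to binary vectors) and compared with Jaccard similarity, which is only one of the options listed above.

```python
# Hypothetical toy catalogue: item -> set of features it has.
items = {
    "film_a": {"comedy", "romance", "1990s"},
    "film_b": {"comedy", "romance", "2000s"},
    "film_c": {"horror", "2000s"},
    "film_d": {"comedy", "romance", "1990s", "british"},
    "film_e": {"horror", "thriller", "2000s"},
}


def jaccard(a, b):
    """Size of the intersection over size of the union of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbours(item, items, n):
    """The n items most similar to `item`, most similar first."""
    others = [other for other in items if other != item]
    others.sort(key=lambda other: jaccard(items[item], items[other]), reverse=True)
    return others[:n]


def worth_rating(item, profile, items, n, x):
    """True if more than x of the item's n nearest neighbours are in the profile."""
    if item in profile:
        return False  # already known to the user
    neighbours = nearest_neighbours(item, items, n)
    return sum(1 for other in neighbours if other in profile) > x


profile = {"film_a", "film_d"}  # items this user already likes
candidates = [i for i in items if worth_rating(i, profile, items, n=2, x=1)]
print(candidates)
```

In practice the neighbour lists would be computed offline in a batch, and only the profile check would happen on the fly.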

Using LOD

To mitigate lack of information/descriptions about concepts/entities.

Recommender systems are usually vertical, but LD lets you easily build a multi-domain recommender system.

To avoid noisy data, you have to filter it before feeding your RS.



  • Automating typing of DBPedia entities.

Vector space model for LOD

  • MATHS.

[Notes] Harith Alani at #SSSW2013

Social Media Analytics with a Pinch of Semantics

Using semantics to solve problems (not solving problems of semantics).

SM for businesses:

  • Analytics.
  • How to measure success?

SM silos impeding progress.
In-house social platforms increasing, so even more so.

SIOC to integrate online community information.

FB Graph.
People are likeaholics.  Their 'likes' become meaningless, so you need to take this into account when making recommendations.
Browse your data and understand user actions.

Behaviour analysis.

Bottom-up analysis.
Can handle unexpected or emerging behaviours.

  • Community members classified into roles.
  • Identify unknown roles.
  • Cope with role changes over time.
  • Clustering to identify emerging roles.

eg. focussed novice; mixed novice; distributed expert; ...
Spectrum across users you can or can't do without.

Extending an ontology built on SIOC.

Encoding rules in ontologies with SPIN.

Three categories of features:

  • Social features (people you follow, people follow you, ...)
  • Content features (what you're posting, keywords, ...)
  • Topical/semantic features

Which behaviour categories do you need to cater for more than others?  How do roles impact activity in an online community?

Consistently see that you need some sort of stable mixture of behaviours for activities in forums to increase.

==> Don't know what's causing which.

What is a healthy community?

Use behaviour analysis to guess what's going to happen to community. Eg.

  • Churn rate.
  • User count.
  • Seeds/non-seeds prop (how many / if people reply to you).
  • Clustering.

Unexpected: the fewer focused experts in the community, the more posts received a reply.
(But quality of answers?)

Community types (Little work in this space)

Muller, M. (CHI 2012) community types in IBM Connections:

  • Communities of Practice
  • Teams
  • Technical support
  • ..
  • .. (see slides..)

Need an ontology and inference engine of community types.
Wants an automated process to tell you what type of community it is - it might be something it wasn't set up for.
Then you could determine what sort of patterns you would expect to find.
No one has done this yet.

Measurements of value and satisfaction

Answers different across communities.  They ran it on IBM Connections - corporate community.

Most of this work is for managers of communities - see what's happening and help to predict what might be coming next.

Can classify users based on Maslow's Hierarchy of Needs?
Mapping the hierarchy to social media communities.
~90% users happily staying at the lower levels of the 'needs hierarchy'.

Behaviour evolution patterns

What paths they follow over time.
eg. people who become moderators eventually.

Engagement analysis

What's the best way to write a tweet so that people care about it?
Which posts are likely to generate more attention?

Getting bored of people finding patterns in individual datasets.  What can be generalised to other communities?

So experimented with 7 datasets and looked at how results differed across:

  • community types.
  • randomness (vs. topicality) of datasets.
  • related experiments.

And people use different features.

Semantic sentiment analysis in social media

Too much research going on, especially on Twitter.

Extract semantic concepts from tweets; likely sentiment for a concept.
Semantics increases accuracy by 6.5% for negative sentiment; 4.8% for positive sentiment.

Students don't use in-house networks because they already use Facebook groups etc. Want to analyse what's happening on them.


Reel Lives (inc. Ed.)
Fragmented digital selves.
Want to automate compilations of media (photos, messages) posted online.

Changing energy consumption behaviour.
Providing information is not enough.

Social Eco feedback technology.

Tuesday, July 09, 2013

#SSSW2013: Collaborative ontology engineering and team formation

We were introduced to the various mini-projects on Tuesday morning, and encouraged to form teams with people who weren't from the same university.  I quickly shortlisted the five that sounded most interesting to me, but was disappointed that there weren't any about multimedia.  Because how to evaluate a very subjective system is a potential problem for me, the project proposed by Valentina Presutti was my first choice:

"Serendipity can be defined as the combination of relevance and unexpectedness: an information is considered serendipitous if it is at the same time very relevant and unexpected for a given user and in the context of a given task. In other words, a user would learn new relevant knowledge. To evaluate the performance of a tool (e.g., an exploratory search tool, a recommending system) in terms of its ability to provide users with serendipitous knowledge is a hard task because both relevance and unexpectedness are highly subjective. This miniproject focuses on two main research questions: what is the correct way of designing a user-study for evaluating an exploratory search tool performance in terms of serendipity? Is it possible to build a reusable set of resources (a benchmark) for evaluating ability to produce serendipity, allowing easier evaluation experiments and comparison among different tools?"

Nobody else seemed to be interested though, so I resigned myself to not being able to do it... until I explained the project and why it was interesting, to the best of my ability, to Andy, Oscar and Josef, and they were sold enough to mark it as our first choice.  Thus Team Anaconda Disappointed (a name of significant and mysterious origins) was born, and Project Cusack (because of the movie Serendipity, which nobody got) was underway.

Our first lecture today was from Lynda Hardman, about telling stories with multimedia objects.  It was super relevant to what I'm doing, to the point where I'm surprised I hadn't come across her work already.  My notes are here.  Lynda has done, for example, work with annotation of personal media objects like holiday photos in order to combine them into a media presentation.  She has considered similar things to me, in particular noting that there are many many aspects of data about multimedia - I had assembled my take on this into a Venn diagram for my poster..

One I hadn't considered is annotating an explicit message of a piece of media, intended by the creator.  This isn't always relevant - sometimes the consumer's interpretation of the media is more important - and this in itself might be an interesting annotation problem.  Competing perspectives - something an ontology should be able to represent.

I need to check out COMM - Core Ontology for Multimedia.

She has an overview of the canonical processes they have consolidated the process of producing digital content into, and how annotation can be formed around these.

Lynda also told us about Vox Populi and LinkedTV; practical applications of annotating multimedia.

I made lots and lots of notes.

Natasha Noy gave us some insights from the biomedical world with regards to ontology development, particularly in relation to the International Classification of Diseases which, when last revised in the 80s, consisted of a lot of paper and a whoever-shouts-the-loudest algorithm for inclusion of terms.  But the next version, currently under creation, is being developed with a version of Web Protégé, customised to be friendly for those who don't know or care about ontologies, and is a truly collaborative process (for those allowed to take part) with accountability for all changes.  It's open too though, so even those without modification rights can view and comment on the developments.  My notes are here.

Lunch was for the first time outside, under the shadows of the forest, and for me was a tray of tomatoey vegetables that were delicious but few.  A striking contrast to Monday's lunch.  Everyone else had some meat-potato combination, preceded by a salad with tuna, and followed by a peach.

The hands-on session followed on from Natasha's talk.  We teamed up (temporarily Anaconda Hopeful) and played with Web Protégé.  There were two magazines and two newspapers, each with four departments.  Anaconda Hopeful were randomly designated the Advertising Department of Iberia Travel (a food and travel magazine).  We got stuck in, on paper first to identify some classes and relations that were relevant to us, and then with Web Protégé, along with the other departments of Iberia Travel.  We didn't come into any conflicts, but ended up creating a few classes that we needed, but should really have been the remit of another department (I guess we just got there first).

Then it was announced that Iberia Travel had bought the other magazine (and one of the newspapers had bought the other), and we had to work together to merge ontologies with the other department.  It became apparent that the other magazine had never had an Advertising Department (no wonder they went under!) so we had no-one to attempt to merge ontologies with.  We attempted to sell our expertise to the Advertising Departments of the newspapers, but there were already too many people involved in the heated debate that came out of the ontology merging there, so we couldn't really get involved.

Later we got cracking with our mini-projects.  Valentina showed us aemoo, and the experiments her team had come up with to try to evaluate it.  We sat down by ourselves to brainstorm, describing a lot of concepts for ourselves, breaking down the notion of serendipity, figuring out what might be wrong with existing experiments to 'measure' serendipity, and collating literature in the area.  (Turns out there is a lot, and it's a very interdisciplinary issue; lots to read about from social sciences, anthropology etc, as well as philosophy of science.  In computing, it seems to be primarily discussed within the realms of recommender systems and exploratory search).

Serendipity seems to be mainly described as a combination of unexpectedness and relevance.  Problems include the sheer subjectivity of it.  Some people are going to get excited by all facts they find out, whether they're useful or not.  Some people are going to have hidden, inexplicit or subconscious goals that affect how 'relevant' something is to them.  People describe their different areas of expertise in different ways; some are more humble than others and would not call themselves an expert in a topic, for example.  So whether or not an event can be considered a serendipitous one is a complex question, which must take into account the person's background, goals and existing knowledge, the task they are trying to achieve (or lack thereof, as serendipity is particularly important - in my opinion - in undirected, loosely-motivated activities), the way they are able or encouraged to interact with a system, what they are doing before and after... all these things make up a context for someone's activities, and none of them seem to be particularly measurable.

Dinner was a vegetable and potato (yay!) starter, followed by spaghetti in tomato sauce (fish for everyone else, although Andy got a custom omelette, lah-de-dah).  Also an apple.  We learnt the hard way not to sit at a table directly underneath a light, as the bugs just raiiiin down.

After dinner we crowded around Enrico who had offered to provide advice about PhD-ing.  From this session, I have a signed diagram of the life of a PhD, because he borrowed my notebook to make it.  I tuned in and out of the discussion, and noticed some irregularities between my PhD and what seemed to be 'normal'.  For instance, most people didn't seem to have as much control over their topic, or what they were doing at any given moment in their first year.  I am really, really enjoying my freedom, but in order to justify that I deserve it I need to sort out my lack of direction and focus.  I need to believe in what I'm doing - not be told by someone else - which is one of the main reasons I am doing this particular PhD.  Perhaps I need to ask for more guidance to more quickly reach the necessary conclusions for myself.  (And, of course, perhaps I also need to stop taking big chunks of time out periodically for different reasons; that might speed up the process as well).

Later, the overriding sentiment was that the job of a PhD student was to answer a question, to produce a theory.  Not to create a system or solve a large problem; certainly not to worry about practical, real-world applications of theories.  Well, I've already explained that this is something I can't accept, and I am still not convinced that that is going to impact on my ability to do a PhD.  Theories develop during practice.  Coding and designing, like writing, are part of my thought processes, and I reach realisations or find new questions to ask through hacking and playing and making.  And why would I be hacking and playing and making, if not to try to produce something of real-world value?  If my motivation in making a system is explicitly to come up with new theories, then my approach and outcomes and realisations will be entirely different.  In trying to make something that works for real people, rather than for researchers in a restricted domain or specific context (a clean and sterile laboratory), I figure out different things; things that matter.

There was another discussion that I came into a bit late, but it sounded like a very harsh discussion about problems with research in industry (rather than academia) that seemed to be very overstated compared to what I have read and experienced myself.

By the end of the day, it felt like I'd been at Summer School for weeks, and had known everyone forever.

[Notes] Natasha Noy at #SSSW2013

Stanford, Protégé.

In past 10-15 years, through collaboration with scientists (particularly biomed), ontologies have become essential.

Don't need to sell ontologies to scientists, they believe in it.

Focus on science because that's where she has experience etc.

We're not so bad at versioning ontologies; versioning data is more of a problem.

Experts add stuff, curator checks quality, and publishes upcoming tasks.

Similar to open source developments, but no research to compare the two.
- Different because biomed people are paid (well).

ICD - International Classification of Diseases.
  • Started 17th century.
  • Causes of death, medical bills, policy making.
  • Revised in 80s over 8 annual conferences.
    • 17-58 countries, 1-5 person delegations, mainly health statisticians.
    • Manual, on paper.
    • Whoever shouted loudest..
    • Paper copies, only English, pdf.
  • ICD-11 - OWL ontology!
    • Open, Protégé (a customised, Web version), links to others.

Conflict resolution:
  • People naturally don't step on each other's toes.
  • Users expect stuff like Web 2.0 interactions, Web interface.
Web Protégé:
  • No consistency checking - coming but currently must go offline.
  • Ontologies are solution to everything - versioning, roles, social interactions.
  • Also plugins are the solutions to everything - visualisations.

[Notes] Lynda Hardman at #SSSW2013


Users (consumers?):

  • Finding content
  • Media types - mostly text at the moment, little integration of different types
  • Specific tasks - not much connection of results with user tasks.

More data than just what you see in the media (cue my Venn diagram).

Plus, eg. paintings - lots of 'cultural baggage'.

Care more about the story than the media.
Interpretation by end users.  Hopefully message that the author intended.

Meaning of combination of assets.
eg. Exhibition of an artist's work.

Interacting further with the media.

  • Search - serendipitous or focussed around a theme (or both).  Different search goals.
  • Sharing, passing it on.

(SW and multimedia community need to work together).

-> Raphael Troncy on Friday - attaching semantics to multimedia on the Web.

Need mechanisms:

  • to identify (parts of) media assets.
  • associate metadata with a fragment.
  • agree on meaning of metadata.
  • enable meaningful structures to be composed, identified and annotated.

Workflow for multimedia applications

  • Canonical processes of media production
    • Reduced to the simplest form possible without loss of generality.

Heard of MPEG-7? Don't bother.. very much from a media algorithms perspective.


  • Feature extraction.
  • News production.
  • New media art.
    • An interactive exhibit that responded to audience present.
  • Hyper-video.
    • Linked video.
  • Photo book production (CeWe).
    • (Using this example for explaining processes).
  • Ambient multimedia systems with complex sensory networks.

Canonical processes overview...

There's a paper.

CeWe photobook - automatic selection, sorting and ordering of photos.
Context (timestamp, tags) analysis and content (colours, edges) analysis.

Things from these you want to represent in your digital system (ie. with LOD):

  • Premediate, eg.
    • remember to take your camera on holiday.
    • write scripts, plan shots.
    • place a security camera in the right location.
  • Construct Message (not really in the chain, appears all over the place); what to convey with media? Intention? eg.
    • show people a great holiday.
    • sell a product.
    • inform/advise.
  • Create (method of creation might be important, so record in metadata), eg.
    • take photos.
    • make video.
  • Annotate, eg.
    • automatic or manual.  Stuff that is embedded by device vendors (but there's so much more...)
    • domain annotations: landscapes/portraits, timestamps, face recognition.
  • Publish, eg.
    • compose images into photobook.
  • Distribute, eg.
    • print photo book and post.
    • cyclic processes online.

COMM - Core Ontology for Multimedia.

Premediate and construct message - human parts, she doesn't expect them to be digitised any time soon.

Using Semantics to create stories with media

Can we link media assets to existing linked data and use this to improve presentation?

How can annotations help?

  • What can be expressed explicitly?
    • Message (somewhere between a html page and poetry).
    • Objects depicted.
    • Domain information.
    • Human communication roles (discourse).

Vox Populi (PhD project)

Traditionally video documentary is a set of shots decided by director/editor.
Annotating video material and showing what the user asks to see.

Annotations for these documentary clips:

  • Rhetorical statement; argumentation model (documentary techniques).
  • Descriptive (which questions asked, interviewee, filmic).
    • Filmic: continuity like camera movements, framing, direction of speaker, lighting, sound - rules that film directors know.
  • Statement encoding (eg. summary what the interviewee said):
    • subject - modifier - object statements.
    • Thesauri for terms.
    • Can make a statement graph, finding which statements contradict and which agree.
    • (He encoded this stuff by hand - automated techniques aren't good enough).
    • Argumentation model - claims, concessions, contradictions, support.
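A minimal sketch of the statement graph idea, assuming statements are plain subject - modifier - object triples and that a small thesaurus says which modifiers oppose each other.  The names here (Statement, OPPOSITES, statement_graph) and the example statements are made up for illustration; they are not taken from Vox Populi itself.

```python
from itertools import combinations
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    """One hand-encoded statement from an interview clip."""
    subject: str
    modifier: str
    obj: str


# Hypothetical thesaurus: pairs of modifiers treated as opposites.
OPPOSITES = {frozenset({"is", "is not"}), frozenset({"supports", "opposes"})}


def relation(a: Statement, b: Statement) -> Optional[str]:
    """Return 'agrees', 'contradicts', or None if the statements are unrelated."""
    if a.subject != b.subject or a.obj != b.obj:
        return None
    if a.modifier == b.modifier:
        return "agrees"
    if frozenset({a.modifier, b.modifier}) in OPPOSITES:
        return "contradicts"
    return None


def statement_graph(statements):
    """Build labelled edges between every pair of related statements."""
    edges = []
    for a, b in combinations(statements, 2):
        rel = relation(a, b)
        if rel is not None:
            edges.append((a, rel, b))
    return edges


clips = [
    Statement("war", "is", "justified"),
    Statement("war", "is not", "justified"),
    Statement("war", "is", "justified"),
]
for a, rel, b in statement_graph(clips):
    print(a, rel, b)
```

A sequence generator could then walk these edges to place a contradicting clip after a claim, or a supporting one, depending on the bias the viewer asked for.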

Automatically generated coherent story.

  • Are we more forgiving watching video? (Than reading these statements as text).  People's own interpretations strongly affect understanding of the message.

Vox Populi has a (not for human consumption) GUI for querying annotated video content.

User can determine subject and bias of presentation.
Documentary maker can just add in new videos and new annotations to easily generate new sequence options.

User information needs - Ana Carina Palumbo

Linked TV.  Enhancing experience of watching TV.  What users need to make decisions / inform opinions.

  • Expert interviews (governance, broadcast).
  • User interviews - what people thought they need (215 ppts).
  • User experiments - what people actually need.

Experiment - oil worth the risk?

  • eg. people wanted factual information from independent sources; what the benefits are; community scale information.

Published at EuroITV.


  • We can give useful annotations to media access, useful at different stages of interactive access (not just search).
  • Clarify intended message. Explicitly with annotations.
  • Manual or automatic.
  • Media content and annotations can be passed among systems.
  • No community agreement in how to do this.
  • How to store?


Hand annotations are error prone - how to validate?
Media stuff - there can be uncertainty, people don't always care.

Motivating researchers to annotate...
Make a game.

Store whole video or segments?
W3C fragment identification standards - timestamps via URLs.
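As a rough illustration of timestamps via URLs: the W3C Media Fragments URI syntax lets a temporal segment be addressed with a #t=start,end fragment, in seconds by default.  The helper name and the example.org URL below are my own assumptions, just to show the shape of it.

```python
def temporal_fragment(url, start, end=None):
    """Return url with a temporal media fragment (#t=start or #t=start,end).

    Times are in seconds.  Any existing fragment on the URL is replaced.
    """
    if start < 0:
        raise ValueError("start must not be negative")
    if end is not None and end <= start:
        raise ValueError("end must be after start")
    base = url.split("#", 1)[0]
    fragment = f"t={start}" if end is None else f"t={start},{end}"
    return f"{base}#{fragment}"


# Hypothetical video URL; address the segment from 10s to 20s.
print(temporal_fragment("http://example.org/interview.mp4", 10, 20))
```

With something like this, annotations can point at a segment of the whole stored video rather than needing each segment saved as a separate file.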

Monday, July 08, 2013

#SSSW2013: Research in theory and practice, and where on earth am I?

The 10th Summer School for Ontology Engineering and the Semantic Web


Arriving by train into Cercedilla, north of Madrid, we immediately encountered other confused looking folk with poster tubes.  So we shared taxis (EUR 10) from Cercedilla station to the summer school residence further north, in the forest.

After getting keys for our pleasant, single, en-suite rooms, arrivals congregated in the shade by the building to introduce ourselves... again, and again, and again, as new people continuously arrived over the space of a few hours.

A really broad mix of people are here in terms of nationalities and places and levels of study, but I still haven't quite got used to the fact that answering 'Semantic Web stuff' is not specific enough in this crowd, when someone asks you what your research is about.  Nobody needs convincing that these technologies are useful!

Later we received schedules, maps, ill-fitting t-shirts* and very helpful name badges, and headed for dinner at the bar down the road.

As is traditional when I write about my experiences in new places, I will describe the food every day.  It has become apparent, at this residence at least, that variety of ingredients is not ordinary, so in this respect meals are simple.  Dinner that first night started with a salad (lettuce, olives, tomato, onion, shredded beetroot and a single slice of hard boiled egg; no dressing), followed by - for the majority - slices of meat (beef? Pork? I dunno..) and fries.  Mine was a plate of mushy green vegetables with a little seasoning, that was pretty tasty.  Dessert was a single pear, delivered with ceremony, but otherwise unadorned.  Healthy, at least.

Yet we (those I sat with at least) were all left feeling a little unsatisfied.

I shared a table with a French, Spanish, Italian and Irish guy.  Conforming appropriately to stereotypes, and setting up reputations for the rest of the week, the French and the Italian shared the bottle of wine on the table; the rest of us went without.

I returned to bed after a couple of hours of socialising and enjoying the cool air in and around the bar.

* For next year, they could ask for t-shirt sizes when they ask for dietary preferences?


The day started early, and with no hot water or wifi for anyone.  Breakfast was combinations of sweet pastries, coffee, tea, juice and bread.

Punctuated variously by coffee breaks, the learning began in earnest.

During the introduction by Mathieu D'Aquin, I found out that I am one of 53 students selected out of 96 applicants to attend this year's Summer School of the Semantic Web!  I had no idea it was that selective, or that there had been that much competition.

The first keynote was by Frank van Harmelen, about all the Semantic Web questions we couldn't ask ten years ago.


Frank started by saying that the early Semantic Web vision has morphed into the more manageable vision of a Web of Data, or a Giant Global Graph, and outlined the principles of the Semantic Web as they appear to stand at present:

1. Give everything a name (entities).
2. Relations form graph between things.
3. Names are addresses on the Web (so we inherit properties of Web like AAA).
4. Add semantics.
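To make those four principles a bit more concrete, here is a toy sketch in plain Python (no RDF library).  Everything in it is an assumption for illustration: the example.org names, the property names, and the single inference rule standing in for 'semantics'.

```python
# 1. and 3. Names are Web addresses (hypothetical example.org URIs).
EX = "http://example.org/"

# 2. Relations between named things form a graph of (subject, predicate, object) triples.
triples = {
    (EX + "frank", EX + "gaveKeynoteAt", EX + "sssw2013"),
    (EX + "amy", EX + "attended", EX + "sssw2013"),
    (EX + "sssw2013", EX + "locatedIn", EX + "cercedilla"),
}

# 4. A scrap of semantics: giving a keynote at an event implies attending it.
SUBPROPERTY_OF = {EX + "gaveKeynoteAt": EX + "attended"}


def infer(graph):
    """Return the graph plus triples implied by the sub-property rule."""
    implied = {(s, SUBPROPERTY_OF[p], o) for s, p, o in graph if p in SUBPROPERTY_OF}
    return graph | implied


def match(graph, s=None, p=None, o=None):
    """Return sorted triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Who attended?  Without inference only Amy; with it, Frank as well.
for subject, _, _ in match(infer(triples), p=EX + "attended"):
    print(subject)
```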

Frank pointed out the advantages of the fact that the Linked Data cloud, which has grown naturally rather than being designed, is now so big that we don't know how many triples it contains, nor how fast it is growing.  Companies and organisations (like Google, NXP, BBC, DataGov) are using Semantic Web technologies to achieve their own ends, for a variety of different use cases, without caring much about the Semantic Web, and this is contributing to the growth.

This growth has given rise to a number of research areas that it was impossible to realistically ask questions about ten years ago, including self-organisation, distribution of data, provenance, dynamics and change, errors and noise (how to deal with disagreements).

Frank asserted that rules and structures, algorithms and patterns in data, exist whether we are looking at them or not.  He used the analogy that OWL is our microscope, and it may be the tool that distorts our vision of the information universe rather than properties of what we are looking at (for example, structures in data presenting themselves well in some domains but not others).

He went on to promote the role of the Informatician as being to form hypotheses, and to test and falsify theories, as scientists rather than engineers.  To discover, rather than build.

I struggle with this view of the world, and feel instinctively that theory and practice are intrinsically linked; one can't exist without the other, not just in the grand scheme of things, but in day to day work and research.  This is one of the main points of contention with my own PhD, and I've no doubt there will be many more blog posts about this issue in the near future as I reconcile my need to create something immediately useful with the necessity of producing a contribution to knowledge at large.

See my raw notes here.

We had an Introduction to Linked Data by Mathieu D'Aquin (raw notes here), followed by a workshop.  We wrote SPARQL queries to populate a pre-written web page with information about Open University courses, sub-courses and locations thereof.

Lunch, similar to the previous night's dinner, was a starter salad, an entire half chicken (or something) plus fries for the carnivores, and the most unappealing risotto of my life for me (not that I'm ungrateful, but I have never been unable to finish a meal due to boredom before).  I went for a walk with some others to grab some fresh air before the afternoon's work, and missed out on watermelon.

Manfred Hauswirth presented some really exciting stuff about annotating and using streams of data.  Particularly challenging is how to integrate this with static data and make inferences over the lot.  Streams include sensor data, as well as ever-flowing social media streams for example; anything that changes over time.

They've built some systems to process this kind of data, and one of them is available as middleware.

My raw notes are here.

In the afternoon we had a poster session, where all participants pinned up posters about their work, and discussed at length with anyone who was interested.  Here's evidence that I participated.

And here's Paolo's:

I wrote a few notes about things from other people's posters that I need to look up.

The main feedback I received was about making sure I focus, narrow down my topic, and concentrate on some evaluatable deliverables that are PhD-worthy.

Questions like (paraphrasing) "why should we care about digital creatives?" threw me, because what I thought was the obvious answer - that they are people too, Web users, technology users, contributors to culture and an ecosystem of digital content and data - was apparently not enough from an academic standpoint.

I was simultaneously told to focus more, and to explain why the problem I'm trying to solve is applicable to all domains, not just digital creatives.  But some of the problems I'm looking at have been (or are being) solved in other domains (like e-health, biological research, education) and the reason what I'm doing is interesting is because none of these solutions quite work for digital creatives, and I want to find solutions that do, and try to figure out why.

I'm still stuck in some sort of struggle between theory and practice; thinking and doing.  And the long-standing problem of how to decide which doing actually worked.

I've started scribbling notes about the narrowing down problem.  I'll need to have this figured out before my first year review in August anyway, so stay tuned for another post all about it.

Then I sneaked off for a nap.

Dinner at the bar again; the usual salad, plus some eggy fish thing for most.  I got a plate of artichoke.  Artichoke is great, I love it, and I'm all for simple meals.  But I remain unconvinced that a plate of only artichoke constitutes an acceptable level of effort on the part of caterers.  And the sheer quantity made it start to taste a bit funny after a while.  But not to worry; we rounded off with a solitary peach apiece.

Further socialising, and appreciation of the night sky, before returning to bed to write blog posts.

I'm super excited and inspired by the talks, the work I've heard about so far, and the atmosphere of the place.  I'm excited to learn a helluva lot, and remind myself that I'm not facing impossible problems, and am not facing many problems alone.  I remember that I am instinctively passionate about the Web and the possibilities it holds (and indeed has already realised) for the empowerment of individuals.  I remember how lucky I am to be able to sustain myself through studying something I love so much, and to have the potential to make a change, and through my work maybe even facilitate others to be able to make a living doing what they love, as well.