Choose Paranoia – in praise of Imposter Syndrome

Athene Donald posted recently on imposter syndrome, that feeling that we’re doing something way beyond our capabilities, perhaps due to clerical error or overenthusiastic “brand management”*. As I’ve touched on before, working in an interdisciplinary team exacerbates that. I’ve heard a talented researcher say “but I haven’t studied maths since the nineties”, and mathematicians wondering out loud what the modifiable unit area problem is. Not that I really know myself…

Interdisciplinary work at its best forces people out of their silos and out of their comfort zones. For example, it’s not enough to be a great mathmo if you don’t gain some understanding of the problem you’re applying yourself to – relying on someone else to deal with the nitty-gritty is not a recipe for success. In this world, everyone should feel like an imposter to some degree.

Although expressed as an afterthought in this blogpost, I have recognised in the past a reverse-Dunning-Kruger type attitude in my behaviour. Dunning-Kruger is the tendency of less competent people to overrate their abilities; reverse-Dunning-Kruger is the tendency for competent people to overestimate others’ abilities or underestimate their own**. I recognise the thought process:

“I’m a reasonably intelligent person, but I don’t possess a unique intellect – so anything which I’m good at can’t be too hard to get good at. That person over there – they’re really good at/knowledgeable about things I find really hard, and they could probably get good at the things I do quite easily, if they had the time and inclination [note to self: perpetuate the myth that physics is REALLY TOUGH so they never develop the inclination]. Oh look, there’s another person who’s an expert in a whole different difficult field. And another. Gee whiz”.

If you find this yourself: welcome to interdisciplinary research. And if you’re not working in interdisciplinary research: welcome to academia. There are lots of smart, hardworking people here. And if you’re not working in academia: welcome to the world. There are lots of bright people doing cool things.

When you look at it that way, Imposter Syndrome doesn’t seem like such a bad thing. I liked writer Leila Johnston’s response: “Imposter syndrome is pretty good, I think, because the alternative is a world in which everyone else is as mediocre as you are.” If the choice is between paranoia and mediocrity, let’s choose paranoia.

*(I don’t know how many academics lie on their CVs – I’m assuming very few – but that is almost certainly a problem in the world at large)

**self-identifying as suffering from reverse-Dunning-Kruger might indicate an overestimation of one’s own abilities, but let’s set that aside for the time being. I’m no expert on forward-reverse-Dunning-Kruger.

Sounds of… Tottenham Court Road

Science Sociologist/Policy Academic/Blogger Alice Bell has very kindly invited me to take part in the Sounds of Science event on February 29th at Charles Darwin House – featuring participants from the BBC, Audioboo and the BMJ. To mark World Radio Day, she wrote this blogpost in celebration, in part, of the sounds around us.

In my time as a scientist I’ve worked in labs, offices, clinics and theatres. I have a particularly vivid memory of being involved with a prostate laser treatment where the theatre staff insisted on playing 80s pop on a little CD player while they worked. Sitting between a man’s stirruped legs, waiting for the treatment to finish while listening to Never Gonna Give You Up gives a new definition to the phrase “rick-rolled”. But I digress.

My current office is on Tottenham Court Road (aka “TCR”) – one of London’s busiest streets. When we record Global Lab (the CASA research podcast) you can hear the sound of TCR in the background –  we frequently have to stop and let ambulances and police cars race past. But we wanted to make a feature of this – CASA is a department that has a lot of projects about sensing the city, and it’s entirely appropriate that we’re right at the heart of one of the world’s most vibrant, most historic cities. So I went out onto the street with my iPhone and captured a bit of this. This is what TCR sounded like last July:

So far, so noisy. That weird whale song sound you can hear is the noise buses make – I think it might be their brakes, reverbed by some reflections between the parallel buildings of TCR. I started thinking about whether I could make that musical. It has a certain tonality to it, and with a bit of looping, a certain rhythm. A GarageBand file was born, complete with “dance” drums, and a guitar and a bass part recorded straight into the computer and augmented with Apple’s rather passable amp simulators. This is what it sounds like:

And, GarageBand users, this is what it looks like:

Anyway, that’s how I made background noise into the theme tune for a podcast. If you want to hear (even) more interesting stories about sound and science, come along at the end of the month to Sounds of Science.

Twitter data – visualised by our MRes students

This term we’ve been running our Visualisation module as part of the CASA MRes in Advanced Spatial Analysis and Visualisation. The flavour of this module is what you’d expect – finding interesting ways to communicate complex spatio-temporal data through static, animated and interactive tools. I teach every other week, focussing on the use of Processing to programmatically represent data; 3D design whiz and course director Andy Hudson-Smith tends to work with ArcGIS, Lumion and other 3D tools.

CASA student Fabian Neuhaus’s Twitter maps have had quite an impact in the past – showing patterns of geographical Twitter usage around London. We challenged our students to take a sample of the same data (collected by Fabian with Steve Gray’s big data tools) and visualise it. The dataset included the date and time of the tweet, its location (only geotagged tweets were considered), and other information like the username, what platform they tweeted from, and language. Here are some examples of what they came up with…

This was my initial (quick and dirty) stab at visualising the data:

One criticism of my initial attempt is that it lacks geographical markers – Alistair Leak tackled that problem by introducing a map. He chose not to blend time or spatial aspects, and with the underlying map this gives perhaps the most accurate representation of the data. The counter to that is that it contains a lot of visual information for the viewer to take in.

Ian Morton took an intermediate approach; a skeletal geographical boundary provides reference points for the viewer. Each tweet persists in time, shrinking and darkening over successive frames. This is a simple and effective visual grammar to provide some “history” or continuity to the vis, whilst retaining a focus on the most recent events.
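A minimal sketch of that "shrink and darken" grammar, in Python rather than the Processing the students used. The linear fade, the 30-frame lifetime and the starting radius are all assumptions for illustration, not Ian's actual values.

```python
def tweet_marker(age_frames, lifetime=30, start_radius=10.0):
    """Return (radius, brightness) for a tweet drawn age_frames after it appeared.

    Both values fall linearly to zero over `lifetime` frames (an assumed curve).
    """
    remaining = max(0.0, 1.0 - age_frames / lifetime)  # 1.0 when new, 0.0 when expired
    return start_radius * remaining, remaining


# A brand new tweet is full size and full brightness; halfway through its
# lifetime it is half of each; after its lifetime it has vanished.
for age in (0, 15, 60):
    print(age, tweet_marker(age))
```

Drawing every live tweet with these values each frame gives the viewer a short visual "memory" without older tweets swamping the newest ones.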

Robin Edwards and Martin Dittus took a 3D approach, binning over a geographical grid in a KDE-like approach. These elegant 3D visualisations have both considered the problem of interpolation – how to move from one data state to the next. Robin has approached that problem by the bars instantly moving to the current data point, and then (if subsequent data at that point is zero), fading down to zero gradually (like an old-school “graphic equaliser”). Martin has written a smooth transition between subsequent data points – so the bars move smoothly *towards* the latest point. These interpolations enhance the polish of a vis and provide a sense of continuity in a noisy or discontinuous dataset. Martin also added a rich functionality for filtering the tweets by metadata (language, twitter platform, etc) – giving the user of the interactive app control over their view of the data.
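The two interpolation styles can be sketched as a pair of per-frame update rules. This is an illustration in Python under assumed parameters (the `fade` and `rate` values are made up), not the code Robin or Martin wrote.

```python
def decay_step(current, target, fade=0.9):
    """Graphic-equaliser style: jump straight to new data, fade when there is none."""
    if target > 0:
        return target          # new data: the bar snaps to it immediately
    return current * fade      # no data: the bar sinks gradually towards zero


def ease_step(current, target, rate=0.2):
    """Smooth style: move a fixed fraction of the remaining distance each frame."""
    return current + (target - current) * rate


# Follow one bar over a few frames of hypothetical data.
data = [8.0, 0.0, 0.0, 4.0]
bar_decay = bar_ease = 0.0
for value in data:
    bar_decay = decay_step(bar_decay, value)
    bar_ease = ease_step(bar_ease, value)
    print(value, round(bar_decay, 2), round(bar_ease, 2))
```

The first rule keeps peaks crisp and readable; the second never shows the exact current value but gives a calmer animation when the data is noisy.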

Jack Harrison decided to dispense with space, and treat Processing as a component in a more complex workflow. He analysed temporal patterns in R and output the result to Processing to create a “clock”. By saving this as a PDF, he was able to import it into Illustrator, allowing him to add the colour scheme and text and create this wonderfully Art Deco, rank-clock-like vis.

This is my take on the data – I’ll blog about it in more detail, but it’s essentially a Gaussian KDE with some transparency to give smooth blending between different time points as well as spatial blending. As I didn’t give the students an opportunity to feed back on my offering (after we gave significant feedback on theirs) I’m sure they will express their opinions below the line…

Academic New Years Resolutions

What are your academic weaknesses? What would you like to improve? And in 2012, how will you resolve (see what I did there) to improve them?

I suspect that many of my Academic New Years Resolutions are the same as everyone else’s: write more papers, get grants, teach better, engage with the publics better. To this, other academics might also add: do the work/life balance thing better, go for promotion, and, if many of them are honest, get a big grant and farm out all their teaching to graduate students and RAs – but these aren’t concerns for me at the moment. If we get into more detail, we start to see different sorts of academics at different career stages have quite diverse short-term goals; for some, it might be publishing their first paper (for PhD students); for others, time management or getting more students.

It can be quite difficult to talk about weaknesses in the competitive world of academia, especially if we view those weaknesses as being core to our work (and, let’s be honest, academics have a diverse and difficult to master range of skills which are held to be core to our work). However, I thought I would share the areas where I really want to get better in 2012 -  I’m interested in hearing from others what they think their weaknesses as an academic are, and how they go about improving…

As an academic, I think I have a fairly acute sense of what my strengths and weaknesses are. I’ve had a fair bit of teaching and public engagement experience; on the minus side I’ve led a fairly peripatetic academic existence, and so my publication record is not the jewel in my crown (especially in social sciences) and neither is my substantive grounding. This is sort of the opposite position that most new lecturers find themselves in – typically they will have a very strong research record but perhaps will have had fewer teaching and PE opportunities.

1: Read more and better
I’m still reading around my new subject (only 18 months in). Finding time to do it can be hard, but committing time to regular reading during the working week is really important. I do read (academic!) papers on the commute sometimes, but I’m not someone who will get home and start reading a treatise on subgraph centrality over their steak dinner. Contextualising knowledge, retaining it through note-taking – these all happen differently in social physics compared to medical physics or quantum physics, and I am still learning how to do that in this new field.

Summary: Protect reading time and learn new study habits for organising knowledge systematically

2: Write more
I can be a bit of a perfectionist when it comes to writing, and I am completely aware that this comes from knowing how savaged things get at the review process. As a musician and writer I taught myself early on that work I share with the world will meet criticism, hatred and indifference as well as interest and praise, and taught myself not to care. That’s not a reasonable outlook for academic work, as people’s criticisms have impact on (e.g.) whether the work is published and are often (but not always) useful for improving the work. I personally think that the writing process will become easier as I have more confidence in what I’m presenting, and view criticism as “suggestions for improvement” rather than “an indictment of my poor scholarship”. All of this might seem terribly thin-skinned of me, but being an itinerant academic (I’ve changed fields twice since my PhD) means that there are plenty of times when I don’t know what I’m talking about.

Summary: Learn to be capable and confident in my scholarship and so to respond positively to criticism

3: Get grants
This seems pretty important. I have only a Co-I role on a small grant to my name.

Summary: Start applying for grants (duh)

4: Improve teaching
I think I’m a decentish lecturer, so now is the time to build on what I view as a reasonably solid foundation and try to make my teaching better. As hinted above, I’m not someone who especially wants to get some jackpot grant and give all my teaching to a research associate – while I think that it’s useful and important for grad students and RAs to do some teaching, I want to teach and I want to teach well. And a good course will attract more students, so there are cynical as well as idealistic reasons for this, too.

How will I teach better? With the small group we have, class-led activities have worked really well, and I want to continue those and expand them into formative assessment exercises – giving students feedback about their progress and encouraging them to assess themselves and collaborate.

Summary: Improve course content and use group-led assessment

There are lots of ways I want to improve as an academic, but I suspect these will be the ones I focus on most over the next 12 months. If there are other academics and researchers out there who want to share their improvement plans and resolutions for 2012, please leave your comments below the line…. I would suggest the twitter hashtag #acNYR12 but it’s long and incomprehensible.

The loneliness of the long-distance cyclist

One of the big concerns in the use of open and public data lies around privacy – whether the information you provide, and that is collected about you, could be used to identify you personally. While this might be an issue with respect to governmental or commercial entities, where I work we are very rarely interested! It’s the patterns that arise from groups of people that are interesting, and knowing that the datum I’m observing is Oliver O’Brien and he lives in Chadwick Road, Peckham* does little to add to my analysis. Now, knowing that a data point lives in SE15, has above median salary and reads the Telegraph* might be useful for some sort of analysis – but at no point do I need his actual name, and while useful, his address is not necessary. With so many overlapping, geographically-coded datasets, it’s been argued that it’s relatively easy to identify individuals, especially those in a minority (whether ethnic, fiscal, or other). While this isn’t meant to dismiss people’s concerns, particularly with respect to governmental and political organisations and businesses, I thought it worth stating the counter-example. To wit: at CASA, knowing someone’s name and address is useless – but we are interested in information about groups of people’s income, lifestyle etc.

As an example, this is a visualisation of the journey of one London Bikeshare bike on one day last year. As noted previously, we don’t have GPS data (and as far as I know, it doesn’t exist) so the routes we assign are reasonable guesses** – only the start and end points and timings are known. Secondly, we don’t know who was using the bike – that’s also hidden from us. And seeing the “path” of one bike is (I hope you’ll agree) rather interesting, but doesn’t tell us much about the system as a whole, which is what we actually care about.

And because it’s Christmas, this is what Xmas *last* year looked like for the bike scheme:

Some very slow cyclists there, making their way home after too much turkey and Christmas cheer. Merry Xmas, readers!

*none of this is supposed to reflect the actual @oobr. He’s much too cool to live in Peckham, for a start

** by Ollie O’Brien, Open Street Map and Routino

Clouds across the moon

This movie shows a heatmap of London Bikeshare activity over the course of an average day – red indicates the density of arrivals, cyan the density of departures – and so white areas are where arrival and departures match. Animation by Martin Zaltz Austwick (@sociablePhysics) with help from Oliver O’Brien (@oobr) of UCL-CASA.

This animation scales the intensity of colour to the all-time maximum – which is why the brightest colours occur at rush hour(s). Those two big dots are King’s Cross and Waterloo. This visualisation is better for comparing activity at different time periods, but is pretty useless for examining spatial patterns at the quieter times.

This animation scales the intensity of colour to the most intense activity at each time point. This leads to the strange paradox of the animation getting brighter as a whole outside rush hour. This is because many areas are similarly busy and no one area stands out – so many areas appear bright. This visualisation is more useful for understanding geographical patterns at each time point and is useless for comparing total activity at different time periods.
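The difference between the two animations comes down to which maximum each frame is divided by. A small Python sketch with made-up numbers (the `frames` values are hypothetical, not taken from the bike data):

```python
# Hypothetical activity at two stations: a quiet time point, then a rush-hour one.
frames = [[1.0, 2.0], [4.0, 8.0]]


def scale_global(frames):
    """Divide every value by the single largest value across all time points."""
    peak = max(max(frame) for frame in frames)
    if peak == 0:
        return [[0.0 for _ in frame] for frame in frames]
    return [[value / peak for value in frame] for frame in frames]


def scale_per_frame(frames):
    """Divide each time point by its own largest value."""
    scaled = []
    for frame in frames:
        peak = max(frame)
        scaled.append([value / peak if peak > 0 else 0.0 for value in frame])
    return scaled


print(scale_global(frames))     # the quiet frame stays dim
print(scale_per_frame(frames))  # both frames use the full brightness range
```

With global scaling the quiet frame tops out at a quarter of full brightness, so its spatial pattern is hard to see. With per-frame scaling both frames look identical, so the fourfold difference in total activity disappears.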

So how was this produced? From a network map, surprisingly. I looked at the Transport for London data of bike journeys (covering November 2010-May 2011) and, based on an average of all the data falling on weekdays, constructed a network which told me, minute by minute, how many bikes were on each route. By “route” I mean “edge” as in “it’s 10.33 – how many bikes are travelling between London Bridge and Gower Place”. Then I summed those up – so “At 10.33, how many bikes in total are on journeys that started from London Bridge” and “at 10.33, how many bikes are travelling towards Gower Place”. Network Theorists – this is broadly like in- and out-degree.*

Bear in mind that this is not the same as the number of bikes leaving (arriving) at that time point – it is the number of bikes on the road at that time point that originated (will end up) at that source (destination). The former analysis is easier to do, in fact, but my code was set up for the latter.
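To make the summing step concrete, here is a minimal Python sketch. The `edge_load` dictionary and its numbers are invented for illustration; the real network came from the Transport for London journey data.

```python
from collections import defaultdict


def station_totals(edge_load):
    """Collapse per-edge bike counts at one minute into per-station totals.

    edge_load maps (origin, destination) -> bikes currently on that edge.
    Returns (out_total, in_total): bikes on the road that started at each
    station, and bikes on the road that are heading towards each station.
    """
    out_total = defaultdict(float)
    in_total = defaultdict(float)
    for (origin, destination), bikes in edge_load.items():
        out_total[origin] += bikes
        in_total[destination] += bikes
    return dict(out_total), dict(in_total)


# Hypothetical snapshot of the network at 10.33.
snapshot = {
    ("London Bridge", "Gower Place"): 3,
    ("London Bridge", "Waterloo"): 2,
    ("Waterloo", "Gower Place"): 1,
}
out_total, in_total = station_totals(snapshot)
print(out_total)
print(in_total)
```

Note that a bike counts towards its destination for the whole time it is on the road, not only at the minute it docks, which is the distinction drawn above.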

That yields a set of points with data about bikes which have left it, and bikes which will arrive at it. The colour scheme could easily be applied to point data, so let’s. Data is scaled to some maximum (the maximum in or out value (whichever’s bigger) either for all time or at the current time, depending on the vis). The colours are overlaid and chosen to be complementary (in this case, red and cyan) – so if the in and out activity is equal, we get White (bright white for strong in, strong out, dimmer grey for weak but equal in and out).
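The complementary colour trick can be sketched in a few lines of Python. This assumes the in and out values have already been scaled to the range 0 to 1, and uses red for arrivals and cyan for departures as in the animation description; the function name is made up.

```python
def blend_colour(arrivals, departures):
    """Additively mix red (arrivals) and cyan (departures) into an (r, g, b) tuple.

    Both inputs are assumed to be pre-scaled to 0..1. Cyan is green plus blue,
    so departures drive both of those channels.
    """
    red = round(255 * arrivals)
    cyan = round(255 * departures)
    return (red, cyan, cyan)


print(blend_colour(1.0, 1.0))  # strong in, strong out: bright white
print(blend_colour(0.2, 0.2))  # weak but equal: dim grey
print(blend_colour(1.0, 0.0))  # arrivals only: pure red
print(blend_colour(0.0, 1.0))  # departures only: pure cyan
```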

That’s the conceptually tricky part. If you know what Gaussian convolution is, that’s what I did next. I played around with the window until it covered the space reasonably. To speed up the process, I created two Gaussian images (one red, one cyan) extending to three standard deviations, and used the intensity point data to create a mask which could be used to scale the intensity of each Gaussian. Then the “new” Gaussian could be drawn, centred on the point position, and using the blend() function, the total intensity of the overlapping Gaussians added to create the heatmap. This was repeated for all the points and both the “in” and “out” point data, and when rescaling at each timepoint, a final rescaling was carried out to ensure that the full dynamic range was being used. Using Processing’s built-in graphics methods seemed to be faster than “by hand” Gaussian convolution, but there are probably even faster ways to do it. Thanks to Jon Reades for hints on speeding up the calls to the MySQL database where the journey data sits.
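For readers who want to see the shape of the idea, here is a rough "by hand" version in plain Python for a single channel. It swaps Processing's image blending for simple addition on a grid of numbers, so it illustrates the technique rather than reproducing the original code. The grid size, standard deviation and point data are all assumptions.

```python
import math


def gaussian_kernel(sd):
    """Precompute a square Gaussian stamp reaching 3 standard deviations each way."""
    radius = int(3 * sd)
    return [[math.exp(-(x * x + y * y) / (2 * sd * sd))
             for x in range(-radius, radius + 1)]
            for y in range(-radius, radius + 1)]


def stamp(grid, kernel, cx, cy, intensity):
    """Add the kernel, scaled by intensity, onto the grid centred at (cx, cy)."""
    radius = len(kernel) // 2
    for ky, row in enumerate(kernel):
        for kx, weight in enumerate(row):
            x, y = cx + kx - radius, cy + ky - radius
            if 0 <= y < len(grid) and 0 <= x < len(grid[0]):
                grid[y][x] += intensity * weight


def heatmap(width, height, points, sd=2.0):
    """Stamp every (x, y, intensity) point, then rescale so the peak is 1.0."""
    grid = [[0.0] * width for _ in range(height)]
    kernel = gaussian_kernel(sd)  # built once and reused for every point
    for x, y, intensity in points:
        stamp(grid, kernel, x, y, intensity)
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[value / peak for value in row] for row in grid]
    return grid


# Two hypothetical stations, one twice as busy as the other.
h = heatmap(20, 20, [(5, 5, 1.0), (14, 14, 0.5)])
print(round(h[5][5], 3), round(h[14][14], 3))
```

Building the kernel once and reusing it is the same shortcut as pre-rendering the Gaussian image: the expensive exponential is computed per kernel cell, not per point. The final division corresponds to the per-timepoint rescaling that uses the full dynamic range.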

Possible extensions: cartographers would probably like to see maps. That’s fairly easily done and would enhance readability whilst sacrificing the rather abstract nature, which I like. I would also have to work a bit harder on using graphics methods for the GC if I did that. Another simple extension would be to use actual arrival/departure data rather than the proxy I describe (I suspect this proxy leads to a certain amount of time-smoothing, which has certain advantages and does not massively skew the results).

*I divide each bike’s contribution to edge weight by its journey time so a bike on a long journey does not have undue weight on the system over all time just by appearing in multiple time windows. If I did not do this, long journeys would be more important than short ones over the course of the day. I don’t want to dwell on this but thought it important to mention – I will no doubt write about this again in the future.
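A small Python sketch of that weighting, assuming journeys are given as start and end minutes and each bike adds 1/duration to its edge for every minute it is on the road. The tuple layout and function name are invented for illustration.

```python
from collections import defaultdict


def edge_weights_by_minute(journeys):
    """Spread each journey's weight evenly over the minutes it spans.

    journeys is a list of (origin, destination, start_minute, end_minute).
    Each bike adds 1/duration per minute, so every journey sums to 1 overall
    no matter how long it lasts.
    """
    weights = defaultdict(float)
    for origin, destination, start, end in journeys:
        duration = end - start
        if duration <= 0:
            continue  # skip zero-length or malformed journeys
        for minute in range(start, end):
            weights[(minute, origin, destination)] += 1.0 / duration
    return weights


# One four-minute trip and one two-minute trip on the same edge.
w = edge_weights_by_minute([("A", "B", 0, 4), ("A", "B", 2, 4)])
print(w[(2, "A", "B")])   # both bikes are on the road at minute 2
print(sum(w.values()))    # each journey contributes 1 in total
```

Without the division, the four-minute trip would count four times and the two-minute trip twice over the day; with it, both contribute equally.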