
Virtual Politics: New ideas

So during today’s consultation, I realised that Fusion Tables was not presenting my data all that clearly. So Juan used another tool I had completely overlooked – Excel’s own graph generator – which produced some interesting plots with my data. (In the graphs below, I’m leaving out the numbers and grid lines since it’s the pattern that’s more interesting.)

(A) Data which reveal the nature of his posts

1. The time of PM Lee’s posts against the date:

Screen Shot 2014-10-23 at 7.11.32 PM

 

2. The number of times PM Lee appears in his photos against the date:

Screen Shot 2014-10-23 at 7.19.25 PM

3. The number of adults in his photos against the date:

Screen Shot 2014-10-23 at 7.22.11 PM

4. The number of children in his photos against the date:

Screen Shot 2014-10-23 at 7.23.14 PM

(B) Data on public response to his posts

1. The number of likes against the date:

Screen Shot 2014-10-23 at 7.25.10 PM

2. The number of shares against the date:

Screen Shot 2014-10-23 at 7.27.31 PM

3. The number of comments against the date:

Screen Shot 2014-10-23 at 7.28.55 PM

And then as we were discussing, Diana had this brilliant idea that the line graphs could be converted to music! Excited to see where this brings us.
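One simple way such a conversion could work is to rescale each data series onto a range of musical pitches. The sketch below is only an assumption about how this might be done, not the method we discussed: the function name `series_to_midi_notes`, the chosen note range and the sample numbers are all hypothetical.

```python
def series_to_midi_notes(values, low_note=48, high_note=84):
    """Linearly rescale a data series onto MIDI note numbers.

    With the common convention that note 60 is middle C (C4),
    the default range 48-84 runs from C3 to C6.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no shape to map, so play one middle note throughout.
        return [(low_note + high_note) // 2 for _ in values]
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in values]


# Made-up like counts, purely for illustration.
likes = [1200, 800, 5000, 3000]
notes = series_to_midi_notes(likes)
print(notes)
```

The lowest value in the series always lands on the lowest note and the highest value on the highest note, so the melody follows the same rise and fall as the line graph.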

Virtual Politics: Technical Realisation

I managed to finish collating my data for 100 of PM Lee’s Facebook posts, and visualised them using Google Fusion Tables, which Juan taught me how to use earlier today :) This tool is really useful for giving a rough overview of the nature of the numbers I’m dealing with, and it also helps me spot typos in my assignment of the categories.

Here are some of the interesting findings I made via Fusion Tables, which I seek to explore further in my project:

 

Screen Shot 2014-10-22 at 3.26.59 AM

Pie chart showing the percentage of photographs taken by different sources. Most are by his photographer Terence Tan, while Lee’s own photos rank second.

Screen Shot 2014-10-22 at 3.24.45 AM

Pie chart showing the countries in which the 100 photos were taken.

Screen Shot 2014-10-22 at 3.29.57 AM

I assigned each post a general category and recorded its number of likes, and Fusion Tables helped me calculate the average number of likes for each category. So it looks like it’s Lee’s posts on sports that attract the most Likes, though only in the context of these 100 posts.

Notice that there are two overlapping categories, “Community Events” and “Community events” – it’s a silly typo I made. But at least Fusion Tables highlighted that to me.
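The same calculation can be sketched outside Fusion Tables. The snippet below is a minimal illustration in plain Python, not what Fusion Tables does internally: it merges category labels that differ only in capitalisation or spacing before averaging. The function name `average_likes_by_category` and the like counts are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows: (category, likes). The numbers are invented.
posts = [
    ("Community Events", 1200),
    ("Community events", 800),
    ("Sports", 5000),
    ("Sports", 3000),
]


def average_likes_by_category(rows):
    """Average likes per category, merging labels that differ only in case or spacing."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of likes, number of posts]
    for category, likes in rows:
        key = " ".join(category.split()).casefold()
        totals[key][0] += likes
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


averages = average_likes_by_category(posts)
for category, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {avg:.0f}")
```

Normalising the label first means a typo like the one above ends up in a single group instead of splitting the average in two.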

Screen Shot 2014-10-22 at 3.34.13 AM

Each time Lee appears in a photo, I assigned it a value of 1, so this scatter plot shows the number of times he appears in his photos each day over the period covered by the 100 photos.

Screen Shot 2014-10-22 at 3.35.19 AM

The average number of adults (blue) and the average number of children (red) that appear in each photo over time.

Screen Shot 2014-10-22 at 3.36.38 AM

A comparison of the number of likes against the number of shares over time.

Screen Shot 2014-10-22 at 3.37.11 AM

A comparison of the number of shares against the number of comments over time.

Screen Shot 2014-10-22 at 3.43.38 AM

Lee’s locations within Singapore in his posts. I’ll have to find out the exact geocodes to make this more comprehensive. But this is one feature that I’m hoping to include in my final website (which will probably be another WordPress site, more details coming soon).

Screen Shot 2014-10-22 at 3.45.11 AM

Lee’s locations throughout the globe in these 100 posts. Again, there are some inaccuracies I’ll have to rectify.

I also tried out the data with ImagePlot, and am quite puzzled by the results:

xaxisnooflikes

 

I set the x-axis to just a series of numbers (1-107) and the y-axis to the number of likes here. Unlike my visualisation with the 30 images, this time my graph seems to be running in a circular-ish form, like Manovich’s Instagram Cities. Problem is, I don’t really know how to interpret this. It’s as though the data has turned into a complete abstraction for me.

time and no of likes

This visualisation is clearer, with the x-axis set to time and the y-axis set to the number of likes. I think that the straight rows indicate photos which were posted at the same time. Lee’s FB page has a little quirk here: he typically publishes about 4-5 posts within the same minute. In some cases it’s because he manually stated the time, but in others they genuinely seem to have been posted in that manner.
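To check that reading of the rows, the posts could be grouped by the minute in which they were published. This is a minimal sketch under my own assumptions: the function name `same_minute_batches` and the timestamps are hypothetical, and the real data would come from the collated spreadsheet.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical post timestamps, invented for illustration.
timestamps = [
    datetime(2014, 10, 1, 20, 15, 5),
    datetime(2014, 10, 1, 20, 15, 40),
    datetime(2014, 10, 1, 20, 15, 59),
    datetime(2014, 10, 2, 9, 30, 0),
]


def same_minute_batches(times, min_size=2):
    """Group timestamps by minute and keep minutes with at least min_size posts."""
    buckets = defaultdict(list)
    for t in times:
        # Dropping seconds and microseconds gives one key per calendar minute.
        buckets[t.replace(second=0, microsecond=0)].append(t)
    return {minute: batch for minute, batch in buckets.items() if len(batch) >= min_size}


batches = same_minute_batches(timestamps)
for minute, batch in batches.items():
    print(minute.strftime("%Y-%m-%d %H:%M"), len(batch), "posts")
```

Each minute that comes back with several posts should line up with one of the straight rows in the plot.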

My biggest concern is that the ImagePlots here seem pretty sparse, so I’m increasing the number of images to 300 per politician, while decreasing the number of parameters so that I can still complete this on time.

So here are my plans for the technical realisation of this project.

1. Create a WordPress site: http://hackingvirtualpolitics.wordpress.com 

2. Create a home page briefly explaining my project aim: to “hack” the Facebook pages of politicians, which are, in essence, PR spectacles to improve their reputation. One way is to analyse the nature of the data (via Fusion Tables), and another, more abstract means is to visualise their “data thumbprints” (via ImagePlot).

3. Create another page, “Analytics”, that compares the Fusion Tables results of 100 photos each of the three politicians.

4. Create a third page, “Thumbprints”, that shows visualisations of 300 photos each of the three politicians, but only based on the parameters of likes, shares, comments, date and time.

5. Create a fourth page, “OSSNTU”, that links back to my project documentation and our class site.

Trails we leave on the Internet

Phototrails is a series of data visualisations revealing the mark each individual leaves on the Internet via their posts on the social networking platform Instagram, and the collective effect of every individual’s post on a “spatial and temporal level”.

Screen Shot 2014-10-15 at 1.48.08 AM

 

Visualisations of Instagram photos uploaded within specific territorial spaces

Art-Science fusion 

Data visualisation has been largely associated with the sciences, with notable milestones including the Human Genome Project and the advent of chemical imaging.

Manovich notes that while his tool is provided by the sciences, he wields it with an artistic purpose, thereby “painting with data”:

“We also have to use the same kind of charts and labels because it’s almost like a standard language used by science. But as an artist I am also interested in the question of how can I present the world through the data… Thinking about landscape paintings in Impressionism, Fauvism, or even Cubism, how could I represent nature today through the contributions of millions of people? So I think of myself as an artist who is painting with data.”

Screen Shot 2014-10-15 at 2.30.47 AM

A montage (artistic technique) of data (scientific stuff)

Manovich also critiques the journalistic interpretation of data as the absolute truth:

“I’m basically trying to say that as opposed to a journalist who thinks about the “data” as a kind of truth, that it’s a way to find out what happened, what I’m thinking about is its own reality… It’s not a question of truth, it’s a question of making interesting connections.”

As a journalism student, I find this insight especially captivating. A journalist is essentially a “data miner”, going out into the thick of the action to collect as many numbers, witness testimonies, photographs and sound bites as possible – data which eventually determines the news angle. Manovich, however, questions the assumption that data is objective, and instead proposes a new way of looking at data: from a macro perspective, to form connections rather than to draw conclusions.

Technical realisation

I am intrigued by Manovich’s choice of Instagram as a repository of photo data. I initially thought that it was probably because Instagram has the richest collection of photographs, which would produce aesthetically appealing visualisations. But through this week’s reading, I realised that he is in fact motivated by the urge to democratise photojournalism:

“When popular media covers exceptional events such as social upheavals, revolutions, and protests, typically they just show you a few professionally shot photographs… Instagram has its own biases and it’s definitely not a transparent window into reality, but would give us, let’s say, a more democratic picture.”

The avoidance of Facebook and Twitter is also intentional, for Manovich was mindful that the uneven power distribution on these platforms would distort the data he obtains:

“We chose Instagram, specifically, because it was not an active tool for citizen journalism. Rather, it was used by a much smaller number of people. It wasn’t dominated by a few power users or by a few voices.”

The software Manovich and his team use is ImagePlot, which you can download for free here (hurray for open source software!!! It takes pretty long to download though). The techniques involved in creating the visualisations are also made transparent here.

Open source culture

I am really inspired by Manovich’s creative exploration of the subjectivity of data through Phototrails, and am grateful that the thought processes and technicalities have been documented in such a transparent manner.

Screen Shot 2014-10-15 at 2.32.44 AM

The visualisation techniques are clearly documented for everyone to understand

Like the hacker culture of the pre-Microsoft days, the data visualisation culture is approaching, if not already experiencing, its heyday. I find it absolutely amazing that data visualisation artists and scientists are largely working in an open-source environment right now, with the data-mining software and apps freely available for anyone. This is the ideal environment to generate innovative and thought-provoking data visualisation pieces that could fundamentally change the way we think and work.

However, I also have this fear that someday, the art/science of data visualisation will be commodified, and that the open source culture will be replaced by tech giants enforcing their proprietorship of data mining tools.

Is this just another irrational fear of mine? I sincerely hope so.