# Flickr photo view counts: an elementary analysis

I was taking a casual look at the number of views on my flickr photos when I noticed something that should not be very surprising: view counts are low for the first few days, then gradually grow to a higher region (around 100 for me). It occurred to me to actually plot the view counts of the photos against the number of days they had been online, to visualize the trend. So I did it in Excel. You just enter the upload date and the current date into date-formatted cells, and ordinary subtraction happily gives you the number of days between them, so that was pretty easy. Here’s the result from my rather limited dataset of 33 photos:

There are some cautionary notes before drawing any conclusion from this graph. One is that not all photos grab the same attention. Some are better than others, and will deviate from a trendline decided simply by the number of days passed, like the highest point in this chart, which is, in my opinion, the best photo I have posted to flickr so far. A graph like this is therefore not expected to show a smooth pattern, because other factors affect a photo’s views, like its quality and how well it was shared and publicized through various social media. Also, as I slowly gather contacts and people who follow my photostream and watch for my uploads, I expect new uploads to get more attention than photos of the same quality did in the past.

Even keeping all this in mind, though, there seems to be some degree of rise in this pretty scattered graph. The linear correlation coefficient (although I don’t expect the relationship to be linear) is around 0.39. On a scale from 0 (no linear trend) to 1 (a perfect straight line), that’s a bit over a third of the way up. Extending that observation, imagine a statistically averaged trendline over many photos of different qualities and different degrees of online publicity, so that only the effect of days passed remains. Several properties of such a trendline curve logically come to mind:

• It shall start from the origin.
• It shall be monotonically increasing, of course. Photos cannot be unviewed once they’ve been viewed.

Wait, did you fall for that second one? I’d be surprised if you didn’t. I fell for it myself, until a little while ago, when I finally gave in to a tiny splinter in my brain that had been groaning against this argument ever since I thought of it. The groaning came from memories of a related puzzle about ensemble averages that I once encountered in Statistical Mechanics, and it eventually turned out to be quite legitimate.

The truth is, there’s actually no reason why that averaged curve should necessarily be monotonically increasing. Why? Well, a point on that curve has a certain x-coordinate, and so corresponds to the average over all photos that are a certain number of days old. Another point, with a different x-coordinate, is an average over a different (and completely disjoint) population of photos. And while the average view count of a fixed set of photos can never go down with time (no view count goes down, so the sum doesn’t go down, so the average doesn’t either), nothing can be said about a comparison between the view counts of two disjoint sets of photos of different ages. It might very well be that the photos you posted five years ago have never, in all that time, received the limelight that your now awesomely professional photos have hogged in just a few months. Thus, the averaged curve may at times even drop with increasing online age. Which, in fact, my scatter plot seems to indicate to some degree.

Thus, while a time series plot with gradually falling y-coordinates (where this coordinate means something good, like views) is in almost all cases bad news, now I know that in this case it is a most enviable sign of growing reputation.
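The argument can be checked with a toy model. This is only a sketch under made-up assumptions: every photo gets an upload spike that saturates within a few days plus a small constant background rate, and the spike is larger for later uploads because the audience has grown. The names `TAU`, `BACKGROUND`, `S0`, `GROWTH` and all the numbers are hypothetical, not measured from any real photostream.

```python
import math

# All values below are assumptions chosen purely for illustration.
TAU = 2.0          # decay time of the upload spike, in days
BACKGROUND = 0.1   # background views per day, the same for every photo
S0 = 20.0          # size of the upload spike for a photo posted on day 0
GROWTH = 0.5       # extra spike views per day of profile age (growing audience)

def views(upload_day, now):
    """Views of one photo at time `now`; for a fixed photo this only grows."""
    age = now - upload_day
    spike = S0 + GROWTH * upload_day  # later uploads reach a bigger audience
    return spike * (1.0 - math.exp(-age / TAU)) + BACKGROUND * age

TODAY = 400
# One photo uploaded every 50 days; take a snapshot of all of them today.
snapshot = sorted((TODAY - u, views(u, TODAY)) for u in range(0, TODAY, 50))
for age, v in snapshot:
    print(f"age {age:3d} days: {v:6.1f} views")
```

With these numbers the snapshot falls steadily with age, from about 200 views for the 50-day-old photo to about 60 for the 400-day-old one, even though no individual photo ever loses a view.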

So we must strike out that second property. On to the next:

• There will be an initial spike in views as the photo is uploaded and the ripples spread through flickr to your contacts, to other pages, and possibly through linked accounts to other sites. This means a higher slope near the beginning, which eventually decays on a timescale that I don’t know anything about at the moment, except that it is probably of the order of a couple of days.
• In the long run, when these transient effects have decayed out, the only thing that keeps view counts going up is the fixed background rate at which people chance upon your photos on flickr. I don’t know what this rate is. But whatever it is, barring the reputation effect I mentioned, it can be assumed to be fixed for a flickr profile, unchanging in time. In real life though, it rises when you gather more contacts, increasing the audience that can discover your photos by some avenue. It’s in no way a negligible effect. Reputation and recognition matter. In fact, that’s finally what most people on flickr and elsewhere are striving for. But ignoring that effect, asymptotically the trendline should become a straight line with positive slope.
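As a minimal sketch, one curve with the surviving properties (it starts at the origin, rises steeply at first, and settles into a rising straight line) is a saturating exponential for the upload spike plus a linear term for the background rate. The functional form and every parameter value here (`spike`, `tau`, `rate`) are assumptions made purely for illustration, not something fitted to the data.

```python
import math

def trend(days, spike=50.0, tau=2.0, rate=0.1):
    """Hypothetical constant-reputation trendline.

    spike: total views contributed by the initial burst (assumed)
    tau:   decay time of that burst in days (assumed)
    rate:  fixed background views per day (assumed)
    """
    return spike * (1.0 - math.exp(-days / tau)) + rate * days

# Steep at first, then nearly a straight line of slope `rate`.
for d in (0, 1, 2, 5, 10, 100, 1000):
    print(d, round(trend(d), 2))
```

The initial slope of this curve is `spike / tau + rate`, and for ages much larger than `tau` it approaches the straight line `spike + rate * days`.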

There are several curves that have roughly these properties, at least over a limited range, like the familiar parabola. The actual curve that will fit this hypothetical data is unknown at this point, of course. The trendline I fitted to my dataset was a parabola with all parameters left free, and it clearly shows the fall towards the end, although I strongly suspect this could be a contribution from that outlier high point (that’s where the hump of the curve is). With my hopelessly insufficient data, this is all pretty arbitrary at this stage:

That’s all I wanted to say, and by itself this is not very interesting stuff, but maybe someone will get some other interesting ideas from this. For instance, one could plot a reputation growth curve calculated from the departure (fall) of this view count curve from the idealized, constant-reputation view count curve, which asymptotes to a rising straight line as I mentioned.
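For anyone who wants to try this on their own photostream without Excel, the two numbers used in this post (the linear correlation coefficient and a free-parameter parabola fit) can be reproduced in a few lines of plain Python. This is a sketch, not what Excel does internally: the function names are my own, and the `days` and `view_counts` lists are made-up placeholders, not my actual 33 photos.

```python
def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c with all three parameters free.

    Builds the 3x3 normal equations and solves them with Cramer's rule.
    """
    s = [sum(x ** k for x in xs) for k in range(5)]                  # sums of x^0 .. x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^0 .. y*x^2
    m = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]      # copy, then swap one column for rhs
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)                # (a, b, c)

# Made-up placeholder data: days online and view counts.
days = [2, 10, 30, 60, 120, 200, 300]
view_counts = [15, 60, 90, 140, 110, 120, 95]

print("r =", pearson_r(days, view_counts))
print("a, b, c =", fit_parabola(days, view_counts))
```

A negative `a` means the fitted parabola has a hump and eventually falls, which is the shape discussed above.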
