groovytweets update 11

groovytweets v86

Link tracking and ranking

Finally! I have been busy checking out all kinds of things, and I finally spent that extra hour to finish one missing feature in groovytweets: link tracking and ranking.

It is a notable feature, as the newsworthiness of tweets really lies in the links that people tweet. For groovytweets, of course, these are just the links with a groovy context, coming from the groovy community that we trust.

One important aspect of link tracking that I noticed early on was that tracking the bit.ly links themselves does not make sense. Too many people follow the bit.ly (and other) redirects, then convert the same target URL into another short URL, and here we go: we have two links. So link tracking in groovytweets is based on the final destination a short link takes you to… groovytweets resolves links and follows the Location: headers till the end.
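
To make the redirect-following idea concrete, here is a minimal Python sketch. It is not the groovytweets code (that runs on Grails); the function names `head_location` and `resolve_final_url`, the hop limit, and the loop check are my own assumptions for illustration.

```python
import http.client
from urllib.parse import urlsplit, urljoin

def head_location(url):
    """Send one HEAD request and return the Location header of a redirect, else None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=5)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=5)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in (301, 302, 303, 307, 308):
            return response.getheader("Location")
        return None
    finally:
        conn.close()

def resolve_final_url(url, fetch_location=head_location, max_hops=10):
    """Follow redirects hop by hop until there is no further Location header.

    fetch_location is injectable so the loop can be tried without a network.
    max_hops and the 'seen' set are assumed safeguards against redirect loops.
    """
    seen = set()
    for _ in range(max_hops):
        if url in seen:
            break  # we have been here before: redirect loop
        seen.add(url)
        target = fetch_location(url)
        if not target:
            break  # no redirect: this is the final destination
        url = urljoin(url, target)  # Location may be relative
    return url
```

Two different short links that point at the same page then resolve to the same key, which is what makes counting by final destination possible.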

And as we are on Twitter, the link count itself is not enough. Someone once said that information older than 48hrs is practically useless for the Twitter generation. The link ranking takes the freshness of a link into account and reduces the overall ranking score over time. Older links will automatically lose a lot of ranking points just because they are old, making space for the new rising ones.
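
I am not giving away the exact formula here, but one way to picture a score that fades with age is an exponential decay. The sketch below is only an illustration: the half-life model, the name `decayed_score` and the 48-hour default (borrowed from the quote above) are assumptions, not the real ranking code.

```python
def decayed_score(mentions, age_hours, half_life_hours=48.0):
    """Mention count discounted by age: the score halves every half_life_hours.

    The half-life form and its 48h default are assumptions for illustration.
    """
    return mentions * 0.5 ** (age_hours / half_life_hours)

# A fresh link with 5 mentions versus a day-old link with 6 mentions.
fresh = decayed_score(mentions=5, age_hours=0)
older = decayed_score(mentions=6, age_hours=24)
print(fresh > older)
```

With these assumed numbers the fresh link wins even though it has fewer mentions, which is the behaviour the ranking is after.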

On a side note: I recently received another request to include search in groovytweets. I am looking into it, but things would just be way nicer if I had a relational DB with ‘LIKE’ etc. On App-Engine, I have to build my own search index if I want to provide a fast solution, and that’s where things can get unnecessarily complex (compared to a grails app with a relational db). I am not totally sure whether I will do it at all, but I have some ideas in mind.

So good for now!

groovytweets update 10

It has been more than 2 months since I last blogged about the groovytweets status, but there have been numerous minor updates and improvements. The friends list (the Twitter users we collect tweets from) has been expanded to ~422 users (and likely more by the time you read this), the regular expressions used to decide if a tweet is ‘groovy’ have been adapted to the changing groovy universe (like gparallelizer being renamed to gpars, or following vmware news now), and so it goes on.

But the real meat is a bit behind the scenes. The features you’re likely to see quite early are language detection (filtering by language) and a new link ranking. I still have to improve the quality of language detection as tweets often use English terms even if the tweet itself would be written in a different language. Larger texts submitted to the Google Translation API of course yield better results; tweets just having 140 characters makes this a bit harder.

The groovy link ranking feature can already be seen live in an early version. I am now collecting the links and tracking their usage in tweets, the same way I track retweets for tweets. The nice thing is that I am tracking the final URL, so if someone used bit.ly to create a short version of a URL, I actually follow the redirects to find the final destination. Next, I am prepared to limit the links shown in the UI to the last few weeks (2 currently), and in addition the relevancy of the links degrades over time. This means a link from today with 5 mentions in the groovy community will eventually be ranked higher than a link from yesterday with 6 mentions, simply because time is an important factor for relevancy.
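
The counting and the two-week window could look roughly like the following Python sketch. The function name `count_recent_links` and the `(final_url, tweeted_at)` input shape are assumptions made up for this example; the real app stores its data in the App Engine datastore.

```python
from collections import Counter
from datetime import datetime, timedelta

def count_recent_links(mentions, now, window=timedelta(weeks=2)):
    """Count mentions per final URL, ignoring anything older than the window.

    mentions: iterable of (final_url, tweeted_at) pairs, where final_url is
    the already-resolved destination of whatever short link was tweeted.
    """
    cutoff = now - window
    return Counter(url for url, tweeted_at in mentions if tweeted_at >= cutoff)
```

Because the key is the resolved URL, two people tweeting different short links to the same page both add to the same counter.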

Links

The real, real change for groovytweets is yet to come though. As you might have heard, the new Twitter Retweet API is on its way. It has been changed multiple times now, based on a lot of user input flowing to Twitter, hopefully even mine. It will fundamentally change how Twitter aggregators/relevancy tools can count retweets. Until now, a retweet was a community-agreed syntax, like RT @originaluser text. In the groovytweets code I was analyzing each incoming tweet to decide if it fits one of the many retweet syntaxes, tried to find the original tweet, then tried to look that tweet up and increase its relevancy.
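
To give an idea of what matching such a syntax means, here is a small Python sketch that only handles the `RT @originaluser text` form plus optional colons. The pattern and the name `parse_retweet` are assumptions for illustration; the many other syntaxes in the wild are not covered.

```python
import re

# Matches "RT @user text", "RT: @user text" and "RT @user: text" (case-insensitive).
RETWEET_RE = re.compile(r"^\s*RT:?\s*@(\w+):?\s*(.*)$", re.IGNORECASE | re.DOTALL)

def parse_retweet(text):
    """Return (original_user, original_text) if text looks like a retweet, else None."""
    match = RETWEET_RE.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
```

The returned user and text are what you would then use to look up the original tweet and bump its relevancy.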

Well, now Twitter is making the retweet an official concept of Twitter. They even give you a new API method to look up the total retweets of a tweet, which sounds great. The downside is that each Twitter account may currently use 150 API calls per hour. If I wanted to update the 50 tweets displayed on the groovytweets homepage every minute, this means 50 tweets * 60 updates per hour = 3000 calls per hour. Well, I got 150. And that is not including the once-a-minute check for new tweets coming from groovytweets friends. So: we’re in trouble here. One solution would be to get whitelisted for more API calls, but there is a better one (or two).
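
The arithmetic is easy to play with in a few lines of Python. The helper names are made up for this sketch; the numbers are the ones from the paragraph above.

```python
def calls_per_hour(tweets, refreshes_per_hour):
    """One lookup per tweet per refresh."""
    return tweets * refreshes_per_hour

def min_refresh_minutes(tweets, hourly_limit):
    """Shortest refresh interval (in minutes) that stays within the hourly limit."""
    return 60 * tweets / hourly_limit

needed = calls_per_hour(tweets=50, refreshes_per_hour=60)
interval = min_refresh_minutes(tweets=50, hourly_limit=150)
print(needed, interval)
```

In other words, with 150 calls the 50 homepage tweets could be refreshed once every 20 minutes at best, and only if nothing else used the API.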

The one solution I still have some hope for is that Twitter will simply include a retweet count with each tweet. The problem here, I guess, is that I am interested in the retweets within a specific community only. And providing the count only for *my* friends instead of a global retweet count (which is way less relevant, some might argue) might potentially be a pretty resource-intensive task for them.

The next and more likely solution involves using the Twitter Streaming API. The good thing about the API is that it will show retweets. Although the API just changed again, making the Retweet now the top element instead of the Tweet (and including a retweet_details element), it is then very easy to detect a Retweet. The bad news: Groovytweets is hosted on Google Appengine, and Appengine kills each request after about 30 seconds. So I invested some time finding a cheap vServer on which I open a permanent streaming connection to Twitter. I will then call an API over on groovytweets to feed the retweet information into the app. This splits the system into two parts, which I wanted to avoid, but it looks like the best solution.

Follow me @hansamann to get the news as it happens.

groovytweets update 9

Hover over retweet

Groovytweets v59 just went live and comes with a new feature: the ability to see who actually retweeted a tweet. Until now I only kept a simple count and did not persist the ‘retweet tweets’ themselves. For about a day now, I have been adding retweets as child entities to the original tweet. And just a few minutes ago, I completed the UI integration of this feature, so you can now hover over the green retweet button to see who retweeted this tweet.

As I just started collecting the retweets in the GAE datastore, only retweets for the messages of the last 24 hrs are available, so if you want to try it out, hover over a colored tweet of the last 24hrs.

Another quick feature that is not really super shiny but very important is that you can now directly request groovytweets.org instead of having to use http://www.groovytweets.org. The solution I have found involves a redirect to a bit.ly url which then redirects back to http://www.groovytweets.org. I know this is lame, but my domain hoster (united-domains.de) really does not allow me to redirect the naked groovytweets.org to http://www.groovytweets.org; they fear an endless loop and don’t allow it. So… you hardly notice the second redirect and I’ll just register future domains somewhere else :-)

Enjoy, and again thanx for not clicking my Google Ads as this is against the policy…

groovytweets update 8

OAuth support

Another feature that was blocking me from working on other things is finally out the door on groovytweets: OAuth support. It’s a big one, at least for me. Supporting OAuth in combination with Twitter means that you can now ‘Sign in with Twitter’ and once you have done this, just press the green retweet links to directly fire off retweet messages. You don’t need to leave the page, and if we successfully sent off the message, the retweet button will become somewhat transparent to indicate the retweet was sent.

Underneath, I am storing your OAuth credentials (token and tokenSecret) in the session (and in the app-engine data store to keep track of the logins). At any time, you can revoke groovytweets’ right to act in your name by going to the twitter/settings page and revoking access.

That’s the great thing about OAuth: groovytweets does not store your username and password, instead we just authenticate with twitter and thereby get authorized. The user stays in full control and can revoke access for any application any time.

The OAuth signing is done with twitter4j, an excellent twitter API for java. There were some issues regarding serialization on app-engine, but these have been solved in the latest 2.0.9 SNAPSHOT of twitter4j.

I hope this feature makes retweeting even more popular. All you have to do now is to log into groovytweets and retweet your favourite tweets. It’s great for the community as we get a great relevance indicator and it is quicker than retweeting from your desktop Twitter client.

Enjoy!

groovytweets update 7

Preview Images

A couple of noteworthy updates just went live as a preview of groovytweets. Keep in mind that www.groovytweets.org might still show an older version without these features; click the preview link to see the new stuff.

So what has changed?

  • The user infoboxes (hover over the twitter user icons) have been refactored and this feature has been expanded to the important tweets screen, too. Still I need to show/hide some rows like bio or url, but I felt refactoring and implementing it on the important tweets screen is more important (or: call me lazy)
  • getsatisfaction.com has been integrated on all pages via a change to the main layout. This is a service to gather feedback/ideas/bugs from the users, notice the feedback box on the right hand side of the page? Just click it to see what I mean.
  • The ‘groovy’ tweet detection will now work on the pure status text of a tweet, meaning usernames will not count as statusText per se. I noticed some tweets were aggregated due to the content having a @mention like @groovyusername, which would make it pass just because of the username. This is now no longer the case. (to be exact: once I switch the preview to the default version)
  • Preview Images: yeah, for me personally, that’s the big one. Just like the infoboxes, it will require some cleanup and refactoring during the next days, but: move your mouse over any link within a tweet. You will notice an overlay appears that shows a preview of the link. The preview generation may take a while the first time someone hovers over a link; after that it is cached by our webthumbs service provider. Glen from groovyblogs.org told me about this service, which he is considering himself. It is a really useful thing plus great eye candy. I will try to wrap the webthumbs service into a plugin so we can all have more previews :-)
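
For the ‘groovy’ detection change in the list above, the idea of matching on the pure status text can be sketched in Python like this. The keyword pattern is a hypothetical stand-in (the real regular expressions are longer and keep changing), and `is_groovy` is a name made up for the example.

```python
import re

MENTION_RE = re.compile(r"@\w+")
# Hypothetical keyword list, for illustration only.
GROOVY_RE = re.compile(r"groovy|grails|griffon|gpars", re.IGNORECASE)

def is_groovy(status_text):
    """Decide on the status text with all @mentions removed first."""
    without_mentions = MENTION_RE.sub(" ", status_text)
    return GROOVY_RE.search(without_mentions) is not None
```

A tweet such as `@groovyusername see you at lunch` would match the raw pattern because of the username alone, but no longer passes once the mention is stripped.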

Also noteworthy: groovytweets is now running on Grails 1.2M1 using the app-engine plugin 0.8.3. I had some minor hiccups installing the plugin (I think the upgrade reinstalled the hibernate plugin, which then had to be uninstalled manually), but the nasty EntityManagerFactory Exception seems to be gone.

Enjoy.

groovytweets update 6

shows twitter user information

Another feature of groovytweets just went live in version v38. If you move your mouse over the twitter user icon of a twitter message, you will now see a popup with some key user information like follower/friends count, location, web and bio. You can also start following that user by clicking on the large green follow link, which takes you to the twitter follow page to follow the user.

I hope you’ll like this feature. I know about certain little issues, e.g. if the user has not filled out his profile you might see a null here and there. I will clean this up in the next few days and, of course, only present the information that is really available.

I am also watching the results of the latest grailspodcast poll: What features would you like to see implemented in groovyblogs.org and groovytweets.org? One feature that will be in shortly is timestamps for tweets. One of the initial ideas, and another reply, was to create a Griffon Desktop App that pulls the tweets. I could think of a nice Growl integration, too… but let me tell you that I really first have to catch up with Griffon. I think I see my personal Griffon Pet Project coming :-)

groovytweets update 5

It’s again getting really late (early) so I am trying to keep this one short. Just today, two new cool features were added to groovytweets:

  • retweet from within groovytweets. You will have noticed the funky green retweet buttons below the twitter user screen names. Clicking these buttons will bring you directly to a twitter update status page if you are already logged into twitter. Otherwise, you first have to sign in and are then taken to the update status page. The status is prefilled with the retweet message. There is currently no check whether the message you are trying to retweet is actually retweetable, e.g. whether there is enough space left to make it a retweet. If a message is too long and does not end with a URL (many do), you may now shorten the message and append … to the shortened message. Groovytweets can still detect this retweet and assign a higher relevancy to the original in this case.
  • new user scanning now includes our followers. We now scan a random follower from time to time and check how many groovy tweets he has produced over the last 200 tweets. If we find 2 tweets, we start following that user.  To make this feature work, I also had to update the data we save from the social graph, namely the followers are now also memcached and updated each hour.
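
The follower scan from the second bullet boils down to something like the Python sketch below. All names are made up for illustration, reading "if we find 2 tweets" as "at least 2" is my assumption here, and the `is_groovy` argument stands for whatever check decides that a tweet is groovy.

```python
import random

def pick_candidate(follower_ids, already_following, rng=random):
    """Pick a random follower we are not following yet, or None if there is none."""
    candidates = [f for f in follower_ids if f not in already_following]
    return rng.choice(candidates) if candidates else None

def should_follow(recent_tweets, is_groovy, sample_size=200, threshold=2):
    """True if at least `threshold` of the newest `sample_size` tweets are groovy.

    recent_tweets is assumed to be ordered newest first.
    """
    groovy_count = sum(1 for text in recent_tweets[:sample_size] if is_groovy(text))
    return groovy_count >= threshold
```

A quick try with a deliberately naive check:

```python
naive_check = lambda text: "groovy" in text.lower()
print(should_follow(["learning groovy", "lunch", "Groovy closures rock"], naive_check))
```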

That’s it for today – have a good one.