hansamann

Archive for June, 2009

groovytweets update 4

In General Stuff on June 27, 2009 at 11:41 pm

I quickly wanted to shout out the latest features of groovytweets that were implemented over the last couple of days:

  • RSS/ATOM feeds via Google’s Feedburner (I just realized I am using 100% Google services: Hosting, Feed Hosting, Ads…). There are two feeds available: a feed with all the latest tweets and one with only the important tweets. Important tweets are tweets that have been retweeted at least once (within the community). Feedburner also offers you subscriptions via email based on those feeds. These feeds are refreshed every 15 minutes.
  • Retweeting of important messages. Once a message has reached the first relevancy level, the twitter user ‘groovytweets’ now retweets this status. I had several iterations on this one, as it was at first not quite clear what it takes not to disturb my own retweet counting, etc., but finally it seems to work. If you follow groovytweets on twitter, this will allow you to identify ‘trending tweets’ quickly. On the other hand, the email/RSS feeds allow you to catch up once or twice a day.
  • Not exactly a feature, but groovytweets has now raised the threshold to follow new people. A user has to be mentioned at least 3 times in the public timelines of current groovytweets friends to become a new friend. At the same time, we still accept friend suggestions (send me a regular message with <suggest @username>).
  • a couple new retweet formats were added.
  • minor changes: we have a favicon, important RSS feed is linked in HTML head, etc

Thank you all for clicking the Google Ads by the way. We got a nice click-through rate, which also made me some Euros so far. Believe me, this money will flow back into the service. We just reached about 40% of the compute allowance for one day. Especially the RSS feeds (hence memcaching the data) will eat up a lot more.

I am also thinking about giving groovytweets a proper open-source license. It is just not something I am particularly good at, so I will look into this topic soon. If there are some good tutorials/guidelines out there, please let me know. I also believe that the more abstract form of groovytweets really has some business potential, so I want to choose a license wisely.

groovytweets update 3

In General Stuff on June 21, 2009 at 8:02 pm

Here’s another update from groovytweets.

As you may know, the ‘important view’ is now done. It works quite nicely (it shows the groovy tweets that were retweeted at least once, based on users groovytweets is following). Unfortunately the technical implementation is a bit crazy. I first wanted to get a list of Tweets where tweet.importance > 0, then sort by tweet.date or tweet.statusId (statusId is an ever-increasing number, so the highest is the latest one). Sounds easy… but: GAE/J does not allow you to filter on one column of the bigtable db with an inequality and then sort by another. I even got a special index, but nope, it seems impossible. So I finally did this: get the last 500 tweets (yes, 500, on each request to the ‘important tweets’). Then in groovy, check for importance > 0 and populate the view. This works quite nicely, and who cares as long as it works?
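The workaround described above can be sketched roughly as follows. This is only an illustration in Python, not the Groovy code the site runs: the `Tweet` class, its field names and the `limit` parameter are assumptions made up for the example, and the list `recent` stands in for the 500 tweets fetched from the datastore.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    status_id: int   # ever-increasing; the highest id is the latest tweet
    text: str
    importance: int  # number of retweets seen within the community


def important_view(recent_tweets, limit=20):
    """Filter and sort in memory instead of inside the datastore query."""
    important = [t for t in recent_tweets if t.importance > 0]
    important.sort(key=lambda t: t.status_id, reverse=True)
    return important[:limit]


# Stand-in for "the last 500 tweets" (hypothetical sample data).
recent = [
    Tweet(101, "Grails 1.1.1 released", 2),
    Tweet(102, "lunch time", 0),
    Tweet(103, "New Griffon plugin", 1),
]
view = important_view(recent)
print([t.status_id for t in view])
```

The cost is that every request touches all fetched tweets, which is fine for a few hundred entities but is the reason caching the result becomes attractive later.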

Another update just implemented: scanning for new groovy users to follow. So far, the addition of ‘friends’ of groovytweets has been a manual process. I simply selected the people I thought were interesting. I began logging the @replies a couple of days ago, which already gave me some interesting insight, but there is one issue with that: Twitter no longer shows an @reply in a follower’s timeline if the user being replied to is not also a friend of that follower. That’s a bummer, as it basically does not allow me to find new groovy users.

So what I did instead is this: every 5 minutes, I pick two random users out of the existing friends list. I get up to 200 tweets from these users, apply the same groovy pattern matching to filter out the groovy tweets and then log the @replies in those groovy tweets. This is running for the first cycles right now, and based on the results I plan to have another cron job (probably every hour) that checks how many times a twitter screen name was mentioned in groovy tweets. Above a certain threshold, I plan to start following that user automatically. But till then, I want to monitor this a couple of days longer.
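A minimal sketch of that mention logging and the later threshold check, written in Python for illustration only. The `GROOVY` pattern is a simplified assumption and not the real matching pattern of the site, the function names are hypothetical, and the threshold of 3 is just a placeholder value.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")
# Assumed, simplified stand-in for the real groovy tweet pattern.
GROOVY = re.compile(r"\b(groovy|grails|griffon)\b", re.IGNORECASE)


def log_mentions(tweets, counts):
    """Count @mentions, but only in tweets that look groovy-related."""
    for text in tweets:
        if GROOVY.search(text):
            counts.update(name.lower() for name in MENTION.findall(text))


def follow_candidates(counts, friends, threshold=3):
    """Screen names mentioned at least `threshold` times that are not friends yet."""
    return sorted(
        name for name, n in counts.items()
        if n >= threshold and name not in friends
    )


counts = Counter()
log_mentions(
    [
        "@alice nice groovy closure trick",
        "grails tip from @alice and @bob",
        "@alice @carol lunch?",            # not groovy-related, ignored
        "Griffon demo by @Alice",
    ],
    counts,
)
candidates = follow_candidates(counts, friends={"bob"})
print(candidates)
```

Splitting the logging from the threshold check mirrors the two cron jobs: one runs often and only counts, the other runs rarely and decides.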

GORM-JPA Plugin: I must admit I am still using plain Grails 1.1.1 and the app-engine plugin, but GORM-JPA should for sure be the future for using Grails as close as possible to the original promise (with GORM) on GAE/J. I will probably create another little test app just for trying out GORM-JPA and then modify the existing code once I know it works fine.

groovytweets.org update 2

In General Stuff on June 17, 2009 at 10:05 pm

I just polished the look and feel of the site and created a basic layout for groovytweets.org. Also, there was a bug related to @username replies where the @ would no longer show up, which is fixed now. In addition to linking @usernames from tweets, #hashtags are now linked directly to twitter search for that tag. I decided to keep those kinds of links in the same font as the regular status text because I believe it distracts too much to see a link every five words in a tweet… but move your mouse over the text and you will see the pointer change.
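The linking of @usernames and #hashtags could look roughly like this. It is a sketch in Python under a few assumptions: the two URL prefixes and the function name `linkify` are illustrative choices, not the ones the site necessarily uses.

```python
import html
import re
from urllib.parse import quote

# Assumed URL patterns, chosen for illustration.
PROFILE_URL = "https://twitter.com/"
SEARCH_URL = "https://twitter.com/search?q="

TOKEN = re.compile(r"([@#])(\w+)")


def _link(match):
    sigil, name = match.group(1), match.group(2)
    if sigil == "@":
        href = PROFILE_URL + name
    else:
        href = SEARCH_URL + quote("#" + name)  # '#' becomes %23
    # Keep the sigil inside the link text so the @ or # stays visible.
    return '<a href="%s">%s%s</a>' % (href, sigil, name)


def linkify(status):
    """Escape the status text, then wrap @usernames and #hashtags in links."""
    return TOKEN.sub(_link, html.escape(status, quote=False))


linked = linkify("@glen check #grails")
print(linked)
```

Handling both token types in a single pass avoids re-scanning markup that an earlier substitution has already produced.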

What’s next?

  • I noticed a few retweet formats still have issues. For example, multiple retweets might not get recognized… I am not sure if I can make all those cases work, but I’ll try. Also, tweets posted via twitlonger are currently not recognized as retweets and therefore do not increase the relevance of the original tweet. This should be quite easy to solve, as I can just search with the original fragment that is still contained in the retweet, assuming that such a fragment is still unique enough for a tweet lookup.
  • Special views. Views like only tweets with a higher than normal relevance or views only with links (or both). This will allow some of you to not waste time checking the groovy news by just concentrating on the relevant stuff.
  • RSS Feeds. Yes, of course. But I first need some more interesting views. I might just create one RSS feed to keep things simple that only publishes low-volume relevant tweets. I think this is what we all want: less noise! If you want to get the full scoop, just visit the website in this case.
  • Instead of or in addition to creating an RSS feed, groovytweets could start tweeting the relevant news itself. All tweets would be retweets in this case (which should work 100% as only the tweets that were ‘retweetable’ can achieve higher relevance). This might be an interesting friend for people who really just want to get the top news each day without participating in the whole retweeting hell.
  • REST API. James Williams posted this idea in reply to my initial announcement on the Grails Mailing list. All it would take is probably a ‘list’ api, e.g. ‘render tweets as JSON / XML’ I guess. This could then be integrated into a Griffon app, a desktop client for the groovytweets site. Although I must say once groovytweets starts tweeting the news itself, following the news becomes as easy as adding groovytweets to your existing twitter desktop client. But still I am interested in getting to know Griffon.

Any other ideas? Send me an email to hansamann (at) yahoo.de. I’ll be on vacation at Tahoe the next couple of days (but podcasting again on Saturday, so watch out for a new episode with some GAE/J content). I will reply once I am back.

p.s pls retweet!

groovytweets.org update

In General Stuff on June 16, 2009 at 11:16 pm

Since I announced groovytweets.org yesterday on the Groovy and Grails Mailing lists, I have received a lot of great feedback and great ideas to improve the service. I just finished a couple of updates, but let me first explain the idea behind groovytweets.org.

Glen and I run the bi-weekly Grails Podcast. To prepare the podcast, we have to keep track of all the relevant news over the last 2 weeks, compile it into a storybook for the show and then read and discuss it. We typically use Glen’s excellent groovyblogs.org, direct emails to grails.podcast@gmail.com and the Groovy Twitter Cleaner Pipe that I created some time ago.

The Groovy Twitter Cleaner Pipe scans the twitter universe for groovy/grails/griffon and then tries to remove all the non-important tweets and those not related to Groovy scripting. The pipe filters the results down; for example, I remove all tweets not containing links as they have little news value (no source for further information). The Groovy Twitter Cleaner Pipe sometimes provides some great new hits, but quite often it also creates a lot of noise from unrelated tweets. It is quite hard to filter out links to groovy images (60s, etc.) while keeping the tweets related to Groovy scripting. For Grails, unfortunately there seems to be a music band, too. And for Griffon, EVERYTHING can be a griffon, really: dogs, jackets, etc.

So the issue with the Groovy Twitter Cleaner Pipe is clearly quality. Too much noise. That’s why I started groovytweets.org. Just like groovyblogs.org, I begin with a manually selected group of ‘Groovy People’ in our community. If these people tweet about ‘groovy’, it is most likely really about the scripting language. After this was done, I thought about how I can measure relevancy. That’s where retweets come into the game. Retweets are a kind of endorsement: if you retweet, it means you want to share some great content with your followers. So whenever I discover a retweet in the timeline of the groovytweets friends, I swallow the RT and instead look up the original tweet and increase that tweet’s relevance.

These are the Retweet formats that are currently supported (I just updated this piece and added one new format):

  • RT @username <original>
  • RT: @username <original>
  • RT: @username: <original>
  • <original> (via @username)
  • ♺ @username: <original>
  • ♺ @username <original>

Most people will probably use RT @username: <original> which I also like the most. It immediately shows the source of the original and is in my opinion the most honest retweet.
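To make the formats above concrete, here is one way they could be recognized and turned into a relevance count. This is a Python sketch, not the site’s actual Groovy pattern: the two regular expressions, the function names and the idea of keying relevance by (username, original text) are all assumptions for illustration, since the real lookup of the original tweet is not described here.

```python
import re
from collections import Counter

# Prefix styles: "RT @user ...", "RT: @user ...", "RT: @user: ...",
# "♺ @user ..." and "♺ @user: ..."
PREFIX_RT = re.compile(r"^(?:RT:?|♺)\s*@(\w+):?\s+(.+)$", re.DOTALL)
# Suffix style: "... (via @user)"
VIA_RT = re.compile(r"^(.+?)\s*\(via\s+@(\w+)\)$", re.DOTALL)


def parse_retweet(status):
    """Return (username, original_text) for a retweet, or None otherwise."""
    status = status.strip()
    m = PREFIX_RT.match(status)
    if m:
        return m.group(1), m.group(2).strip()
    m = VIA_RT.match(status)
    if m:
        return m.group(2), m.group(1).strip()
    return None


def record_retweet(status, relevance):
    """Swallow the retweet and bump the relevance of the original instead."""
    parsed = parse_retweet(status)
    if parsed is None:
        return False
    relevance[parsed] += 1
    return True


relevance = Counter()
record_retweet("RT @glen Grails rocks", relevance)
record_retweet("Grails rocks (via @glen)", relevance)
print(relevance[("glen", "Grails rocks")])
```

Matching on the exact original text is fragile as soon as a retweeter shortens or edits the message, which is one reason truncated and twitlonger retweets need extra handling.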

The only issue with retweets is that sometimes the retweeted messages are too long, i.e. more than 140 characters. In this case one can use twitlonger.com, and I plan to support this soon, too. But the best advice regarding the retweetability of tweets is really to keep the messages to around 120 characters, so others can actually retweet them easily.

Besides the new RT formats, these features were added today:

  • groovytweets is available at www.groovytweets.org. As naked domains are not supported right now by GAE/J domains, I have to use the www.
  • @<username> in the status text is now directly linked to the twitter profile page of that user
  • the tweet matching pattern was slightly improved, I removed /gram/ which was found too often in tweets like ‘I like programming my DVR’. I will consider using word boundaries \b if gram turns out to be a still relevant groovy technology, but I have not heard too much about it so far…
  • the still basic groovytweets UI is now not only pulling in new tweets (via AJAX), it also updates the relevance of existing tweets in the HTML page every minute.
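The /gram/ problem from the list above is easy to reproduce. The short Python sketch below compares a plain substring pattern with a word-boundary version; the function names and the second sample tweet are invented for the example.

```python
import re

LOOSE = re.compile(r"gram")                       # matches inside other words
STRICT = re.compile(r"\bgram\b", re.IGNORECASE)   # matches the whole word only


def matches_loose(text):
    return LOOSE.search(text) is not None


def matches_strict(text):
    return STRICT.search(text) is not None


noise = "I like programming my DVR"
hit = "Trying out gram today"  # invented sample tweet

print(matches_loose(noise), matches_strict(noise))
print(matches_loose(hit), matches_strict(hit))
```

With `\b` on both sides, ‘programming’ no longer triggers a match while the standalone word still does.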

The whole app runs on GAE/J and right now uses just about 1% of the daily compute allowance. So I have hope to keep it free, but I might consider using some google ads or so. Let’s see; this is not the most important thing right now, as I am first trying to get the core service right.

I also want to upload the source code to github, but right now the credentials are still in the code. I plan to refactor the credentials into a database entity soon, so I can keep the source code on github complete while leaving all credentials out.

More about my experience with GAE/J and Grails 1.1.1 and the app-engine plugin will follow in the next podcast.