hansamann

Archive for 2009

groovytweets update 11

In General Stuff on November 2, 2009 at 8:59 pm
groovytweets v86

Link tracking and ranking

Finally! I have been busy checking out all kinds of things, and I finally spent that extra hour to finish one missing feature in groovytweets: link tracking and ranking.

It is a notable feature, as the newsworthiness of tweets really lies in the links that people tweet. For groovytweets, of course, these are just the links with a groovy context that come out of the groovy community we trust.

One important aspect of link tracking that I noticed early on was that tracking the bit.ly links themselves does not make sense. Too many people follow the bit.ly (and other) redirects, then convert the same target URL into another short URL, and here we go: we have two links. So link tracking in groovytweets is based on the final destination a short link takes you to… groovytweets resolves links and follows the Location: headers till the end.
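A minimal sketch of the idea of following redirects to the final destination. It is not the groovytweets code: the function names, the hop limit and the redirect table are assumptions made for illustration, and the HTTP lookup is passed in as a function so the logic can be shown without a network call.

```python
from typing import Callable, Optional


def resolve_final_url(url: str,
                      fetch_location: Callable[[str], Optional[str]],
                      max_hops: int = 10) -> str:
    """Follow redirect targets until a URL no longer redirects.

    fetch_location returns the Location target for a URL, or None if the
    URL does not redirect. max_hops is an assumed safety limit.
    """
    seen = set()
    current = url
    for _ in range(max_hops):
        if current in seen:          # redirect loop: stop where we are
            break
        seen.add(current)
        target = fetch_location(current)
        if target is None:           # no Location header: final destination
            break
        current = target
    return current


# Hypothetical redirect table standing in for real HTTP responses.
redirects = {
    "http://bit.ly/abc": "http://example.org/post",
    "http://short.example/xyz": "http://bit.ly/abc",
}

final_a = resolve_final_url("http://bit.ly/abc", redirects.get)
final_b = resolve_final_url("http://short.example/xyz", redirects.get)
```

Both short links end up at the same final URL, so they can be counted as one link.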

And as we are on twitter, the link count itself is not enough. Someone once said that information older than 48hrs is practically useless for the Twitter-generation. The link ranking takes the freshness of a link into account and reduces the overall ranking score over time. Older links will automatically lose a lot of ranking points just because they are old, making space for the new rising ones.

On a side note: I recently received another request to include search in groovytweets. I am looking into it, but things would just be way nicer if I had a relational DB with ‘LIKE’ etc. On App-Engine, I have to build my own search index if I want to provide a fast solution and that’s where things can get (compared to a grails app with a relational db) unnecessarily complex. I am not totally sure if I will do it at all, but I have some ideas in mind.

So good for now!

groovytweets update 10

In General Stuff on October 8, 2009 at 10:09 pm

It has been more than 2 months since I blogged about the groovytweets status, but there have been numerous minor updates and improvements. The friends list (the Twitter users we collect tweets from) has been expanded to ~422 users (and likely more when you read this), the regular expressions used to decide if a tweet is ‘groovy’ have been adapted to the changing groovy universe (like gparallelizer being renamed to gpars, or following vmware news now), and so it goes on.

But the real meat is a bit behind the scenes. The features you’re likely to see quite early are language detection (filtering by language) and a new link ranking. I still have to improve the quality of language detection, as tweets often use English terms even if the tweet itself is written in a different language. Larger texts submitted to the Google Translation API of course yield better results; tweets having just 140 characters makes this a bit harder.

The groovy link ranking feature can already be seen live in an early version. I am now collecting the links and tracking their usage in tweets the same way as retweets for a tweet. The nice thing is that I am tracking the final URL, so if someone used bit.ly to create a short version of a URL I am actually following the redirects to find the final destination. Next, I am prepared to limit the links shown in the UI to the last few weeks (currently 2), and in addition the relevancy of the links degrades over time. This means a link from today having 5 mentions in the groovy community will end up ranked higher than a link from yesterday having 6 mentions, simply because time is an important factor for relevancy.
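To make the "5 mentions today beats 6 mentions yesterday" example concrete, here is a small sketch of a time-decayed score. The post does not say which decay function groovytweets uses; the exponential decay and the 24-hour half-life below are assumptions chosen for illustration, as are the function and field names. Only the two-week display window comes from the text.

```python
def link_score(mentions: int, age_hours: float,
               half_life_hours: float = 24.0) -> float:
    """Mentions weighted by freshness; the score halves every half-life.

    The half-life value is a hypothetical choice, not the real setting.
    """
    return mentions * 0.5 ** (age_hours / half_life_hours)


def rank_links(links, max_age_hours: float = 14 * 24):
    """Drop links older than the display window, then sort by decayed score."""
    recent = [link for link in links if link["age_hours"] <= max_age_hours]
    return sorted(recent,
                  key=lambda link: link_score(link["mentions"], link["age_hours"]),
                  reverse=True)


links = [
    {"url": "http://example.org/yesterday", "mentions": 6, "age_hours": 24},
    {"url": "http://example.org/today", "mentions": 5, "age_hours": 0},
    {"url": "http://example.org/ancient", "mentions": 50, "age_hours": 30 * 24},
]
ranked = rank_links(links)
```

With these assumed numbers the fresh link scores 5.0 and the day-old link 3.0, so the fresh one is listed first, and the month-old link is not shown at all.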

Links

The real, real change for groovytweets is yet to come though. As you might have heard, the new Twitter Retweet API is on its way. It has been changed multiple times now, based on a lot of user input flowing to Twitter and hopefully even mine. It will fundamentally change how Twitter aggregators/relevancy tools can count retweets. Until now a Retweet was a community-agreed syntax, like RT @originaluser text. In the groovytweets code I was analyzing each incoming Tweet to decide if it fits one of the many retweet syntaxes, then tried to find the original tweet, look it up and increase its relevancy.

Well, now Twitter is making the Retweet an official concept of Twitter. They even give you a new API method to look up the total retweets of a tweet, which sounds great. The downside is that each Twitter account may currently use 150 API calls per hour. If I wanted to update the 50 tweets displayed on the groovytweets homepage every minute, this means 50 tweets * 60 updates per hour = 3000 calls per hour. Well, I got 150. And that is not including the minutely check for new tweets coming from groovytweets friends. So: we’re in trouble here. One solution would be to get whitelisted for more API calls, but there is a better one (or two).

The one solution I still have some hope for is that Twitter will simply include a retweet count with each Tweet. The problem here, I guess, is that I am interested in the retweets within a specific community only. And providing the count only for *my* friends instead of a global retweet count (which is way less relevant, some might argue) might potentially be a pretty resource-intensive task for them.
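The arithmetic behind the rate-limit problem, written out as a short calculation. The figures for tweets, updates and the hourly limit come from the text; the assumption that each minutely timeline check costs exactly one API call is mine, and the variable names are only illustrative.

```python
TWEETS_ON_HOMEPAGE = 50
UPDATES_PER_HOUR = 60            # one update per minute
RATE_LIMIT_PER_HOUR = 150
TIMELINE_CHECKS_PER_HOUR = 60    # assumption: one API call per minutely check

# One retweet-count lookup per displayed tweet, every minute.
calls_needed = TWEETS_ON_HOMEPAGE * UPDATES_PER_HOUR

# How far over the limit that would be.
shortfall_factor = calls_needed / RATE_LIMIT_PER_HOUR

# What is left once the timeline checks are paid for.
calls_left_for_counts = RATE_LIMIT_PER_HOUR - TIMELINE_CHECKS_PER_HOUR
full_refreshes_per_hour = calls_left_for_counts // TWEETS_ON_HOMEPAGE

print(calls_needed, shortfall_factor, calls_left_for_counts, full_refreshes_per_hour)
```

Under these assumptions the per-tweet lookup would need 20 times the allowed budget, and the calls left over after the timeline checks would cover only one full refresh of the homepage per hour.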

The next and more likely solution involves using the Twitter Streaming API. The good thing about the API is that it will show retweets. Although the API just changed again, making the Retweet now the top element instead of the Tweet (and including a retweet_details element), it is then very easy to detect a Retweet. The bad news: Groovytweets is hosted on Google Appengine, and Appengine kills each request after about 30 seconds. So I invested some time finding a cheap vServer on which I open a permanent streaming connection to Twitter. I will then call an API over on groovytweets to feed the retweet information into the app. This splits the system into two parts, which I wanted to avoid, but it looks like the best solution.

Follow me @hansamann to get the news as it happens.

groovytweets update 9

In General Stuff on July 29, 2009 at 9:19 pm

Hover over retweet

Groovytweets v59 just went live and comes with a new feature: the ability to see who actually retweeted a tweet. So far I did a simple count but did not persist the ‘retweet tweets’. For about a day now I have been adding retweets as child entities to the original tweet. And just a few minutes ago, I completed the UI integration of this feature so that you can now hover over the green retweet button to see who retweeted this tweet.

As I just started collecting the retweets in the GAE datastore, only retweets for the messages of the last 24 hrs are available, so if you want to try it out, hover over a colored tweet of the last 24hrs.

Another quick feature that is not really super shiny but very important is that you can now directly request groovytweets.org instead of having to use www.groovytweets.org. The solution I have found now involves a redirect to a bit.ly url which then redirects back to www.groovytweets.org. I know this is lame, but my domain hoster (united-domains.de) really does not allow me a redirect from the naked groovytweets.org to www.groovytweets.org; they fear an endless loop and don’t allow it. So… you hardly notice the second redirect and I’ll just register future domains somewhere else :-)

Enjoy, and again thanx for not clicking my Google Ads as this is against the policy…

groovytweets update 8

In General Stuff on July 22, 2009 at 9:25 pm

OAuth support

Another feature that was blocking me from working on other things is finally out the door on groovytweets: OAuth support. It’s a big one, at least for me. Supporting OAuth in combination with Twitter means that you can now ‘Sign in with Twitter’ and once you have done this, just press the green retweet links to directly fire off retweet messages. You don’t need to leave the page, and once we have successfully sent off the message, the retweet button becomes somewhat transparent to indicate the retweet was sent.

Underneath, I am storing your OAuth credentials (token and tokenSecret) in the session (and in the app-engine data store to keep track of the logins). At any time, you can revoke groovytweets’ right to act in your name by going to the twitter/settings page and revoking access.

That’s the great thing about OAuth: groovytweets does not store your username and password, instead we just authenticate with twitter and thereby get authorized. The user stays in full control and can revoke access for any application any time.

The OAuth signing is done with twitter4j, an excellent twitter API for java. There were some issues with regard to serialization in app-engine, but these have been solved in the latest 2.0.9 SNAPSHOT of twitter4j.

I hope this feature makes retweeting even more popular. All you have to do now is to log into groovytweets and retweet your favourite tweets. It’s great for the community as we get a great relevance indicator and it is quicker than retweeting from your desktop Twitter client.

Enjoy!

groovytweets update 7

In General Stuff on July 11, 2009 at 10:38 pm

Preview Images

A couple of noteworthy updates just went live as a preview of groovytweets. Keep in mind that www.groovytweets.org might still show an older version without these features; click the preview link to see the new stuff.

So what has changed?

  • The user infoboxes (hover over the twitter user icons) have been refactored and this feature has been expanded to the important tweets screen, too. Still I need to show/hide some rows like bio or url, but I felt refactoring and implementing it on the important tweets screen is more important (or: call me lazy)
  • getsatisfaction.com has been integrated on all pages via a change to the main layout. This is a service to gather feedback/ideas/bugs from the users, notice the feedback box on the right hand side of the page? Just click it to see what I mean.
  • The ‘groovy’ tweet detection will now work on the pure status text of a tweet, meaning usernames will not count as statusText per se. I noticed some tweets were aggregated due to the content having a @mention like @groovyusername, which would make it pass just because of the username. This is now no longer the case. (to be exact: once I switch the preview to the default version)
  • Preview Images: yeah, for me personally, that’s the big one. Just like the infoboxes, it will require some cleanup and refactoring during the next days, but: move your mouse over any link within a tweet. You will notice an overlay appears that shows a preview of the link. The preview generation may take a while the first time someone hovers over it; after that it is cached by our webthumbs service provider. Glen from groovyblogs.org told me about this service, which he is considering himself. It is a really useful thing plus great eye candy. I will try to wrap the webthumbs service into a plugin so we can all have more previews :-)
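The change to the ‘groovy’ detection described in the list above (matching on the status text without @mentions) could look roughly like this. The keyword pattern is a stand-in, since the real regular expressions are not shown in the post, and the function name is hypothetical.

```python
import re

# Hypothetical stand-in for the real groovytweets patterns.
GROOVY_PATTERN = re.compile(r"groovy|grails|griffon", re.IGNORECASE)
MENTION = re.compile(r"@\w+")


def is_groovy(status_text: str) -> bool:
    """Match only against the text that remains after removing @mentions."""
    without_mentions = MENTION.sub("", status_text)
    return GROOVY_PATTERN.search(without_mentions) is not None


only_username = is_groovy("@groovyusername see you at lunch")
real_content = is_groovy("@someone the new Grails milestone is out")
```

A tweet that only mentions a user with "groovy" in the screen name no longer passes, while a tweet with a groovy keyword in its actual text still does.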

Also noteworthy: groovytweets is now running on Grails 1.2M1 using the app-engine plugin 0.8.3. I had some minor hiccups installing the plugin (I think the upgrade reinstalled the hibernate plugin, which then had to be uninstalled manually), but the nasty EntityManagerFactory Exception seems to be gone.

Enjoy.

groovytweets update 6

In General Stuff on July 7, 2009 at 8:40 pm
shows twitter user information

Another feature of groovytweets just went live in version v38. If you move your mouse over the twitter user icon of a twitter message, you will now see a popup with some key user information like follower/friends count, location, web and bio. You can also start following that user by clicking on the large green follow link, which takes you to the twitter follow page to follow the user.

I hope you’ll like this feature. I know about certain little issues, e.g. if the user has not filled out his profile you might see a null here and there. I will clean this up in the next few days and only present the information that is really available, of course.

I am also watching the results of the latest grailspodcast poll: What features would you like to see implemented in groovyblogs.org and groovytweets.org? One feature that will be in shortly is timestamps for tweets. One of the initial ideas, and another reply, was to create a Griffon Desktop App that pulls the tweets. I could think of a nice Growl integration, too… but let me tell you that I really first have to catch up with Griffon. I think I see my personal Griffon Pet Project coming :-)

groovytweets update 5

In General Stuff on July 3, 2009 at 11:36 pm

It’s again getting really late (early) so I am trying to keep this one short. Just today, two new cool features were added to groovytweets:

  • retweet from within groovytweets. You will have noticed the funky green retweet buttons below the twitter user screen names. Clicking these buttons will bring you directly to a twitter update status page, if you are already logged into twitter. Otherwise, you first have to sign in and are then taken to the update status page. The status is prefilled with the retweet message. There is currently no check if the actual message you are trying to retweet is retweetable, e.g. if there is enough space left to make it a retweet. If a message is too long and does not end with a URL (many do), you may now shorten the message and append … to the shortened message. Groovytweets can still detect this retweet and assign a higher relevancy to the original in this case.
  • new user scanning now includes our followers. We now scan a random follower from time to time and check how many groovy tweets he has produced over the last 200 tweets. If we find 2 tweets, we start following that user.  To make this feature work, I also had to update the data we save from the social graph, namely the followers are now also memcached and updated each hour.
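The follower check from the second bullet can be sketched as a simple count over a user’s recent tweets. The 200-tweet window and the threshold of 2 come from the text; reading "if we find 2 tweets" as "at least 2" is an assumption, and the keyword test and function names are placeholders for the real pattern matching.

```python
import re

# Placeholder for the real groovy tweet detection.
GROOVY_PATTERN = re.compile(r"groovy|grails|griffon", re.IGNORECASE)


def looks_groovy(text: str) -> bool:
    return GROOVY_PATTERN.search(text) is not None


def should_follow(recent_tweets, min_groovy_tweets: int = 2,
                  scan_limit: int = 200) -> bool:
    """Follow a user if enough of their most recent tweets are groovy."""
    scanned = recent_tweets[:scan_limit]
    groovy_count = sum(1 for text in scanned if looks_groovy(text))
    return groovy_count >= min_groovy_tweets


active_user = ["Playing with Grails today", "lunch", "Groovy closures rock"]
casual_user = ["lunch", "Groovy music on the radio", "traffic again"]

follow_active = should_follow(active_user)
follow_casual = should_follow(casual_user)
```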

That’s it for today – have a good one.

groovytweets update 4

In General Stuff on June 27, 2009 at 11:41 pm

I quickly wanted to shout out the latest features of groovytweets that were implemented in the last couple of days:

  • RSS/ATOM feeds via Google’s Feedburner (I just realized I am using 100% Google services: Hosting, Feed Hosting, Ads…). There are two feeds available: a feed with all the latest tweets and one only with the important tweets. Important tweets are tweets that have been retweeted at least once (within the community). Feedburner also offers you subscriptions via Email based on those feeds. These feeds are refreshed every 15 minutes.
  • Retweeting of important messages. Once a message has reached the first relevancy level, the twitter user ‘groovytweets‘ is now retweeting this status. I had several iterations on this one, as it was at first not quite clear what measures it takes not to disturb my own retweet counting, etc., but finally it seems to work. If you follow groovytweets on twitter, this will allow you to identify ‘trending tweets’ quickly. On the other hand the email/RSS feeds allow you to catch up once or twice a day.
  • Not exactly a feature, but groovytweets now increased the threshold to follow new people. There have to be at least 3 mentions in the public timeline of another current groovytweets friend to become a new friend. At the same time, we still accept friend suggestions (send me a regular message with <suggest @username>).
  • a couple new retweet formats were added.
  • minor changes: we have a favicon, important RSS feed is linked in HTML head, etc

Thank you all for clicking the Google Ads by the way. We got a nice click-through rate, which also made me some Euros so far. Believe me, this money will flow back into the service. We just reached about 40% of the compute allowance for one day. Especially the RSS feeds (hence memcaching the data) will eat up a lot more.

I am also thinking about giving groovytweets a proper open-source license. It is just not something I am particularly good at, so I will look into this topic soon. If there are some good tutorials/guidelines out there, please let me know. I also believe that the more abstract form of groovytweets really has some business potential, so I want to choose a license wisely.

groovytweets update 3

In General Stuff on June 21, 2009 at 8:02 pm

Here’s another update from groovytweets.

As you may know, the ‘important view‘ is now done. It works quite nicely (it shows the groovy tweets that were retweeted at least once, based on users groovytweets is following). Unfortunately the technical implementation is a bit crazy. I first wanted to get a list of Tweets where tweet.importance > 0, then sort by tweet.date or tweet.statusId (statusId is an ever-increasing number and the highest is the latest one). Sounds easy…. but: GAE/J does not allow you to query on one column of the bigtable db, then sort by another. I even got a special index, but nope, it seems impossible. So I finally did this: get the last 500 tweets (yes, 500, on each request to the ‘important tweets’). Then in groovy, check for importance > 1, populate view. This works quite nicely, who cares as long as it works?
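The workaround described above (fetch the latest tweets by statusId, then filter by importance in application code) can be sketched as follows. This is plain Python over an in-memory list, not datastore code; the Tweet class, the function name and the sample data are made up for illustration, and the importance threshold is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    status_id: int      # ever-increasing, so the highest is the latest
    importance: int
    text: str


def important_view(tweets, threshold: int = 0, fetch_limit: int = 500):
    """Take the newest fetch_limit tweets, then keep the important ones.

    Sorting stands in for the datastore query ordered by status id;
    the importance filter runs in memory afterwards.
    """
    latest = sorted(tweets, key=lambda t: t.status_id, reverse=True)[:fetch_limit]
    return [t for t in latest if t.importance > threshold]


sample = [
    Tweet(101, 0, "plain tweet"),
    Tweet(102, 2, "retweeted twice"),
    Tweet(103, 1, "retweeted once"),
]
view = important_view(sample)
```

The result keeps the newest-first order of the fetch, so no second sort is needed after filtering.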

Another update just implemented: scanning for new groovy users to follow. So far, the addition of ‘friends’ of groovytweets has been a manual process. I simply selected the people I thought were interesting. I began logging the @replies a couple of days ago, which already gave me some interesting insight, but there is one issue with that: Twitter disabled status updates in the timeline of a following user if the @reply user is not itself a friend of that user. That’s a bummer, as it basically does not allow me to find new groovy users.

So what I did instead now is this: every 5 minutes, I pick two random users out of the existing friends list. I get up to 200 tweets from these users, apply the same groovy pattern matching to filter out the groovy tweets and then log the @replies in those groovy tweets. This is running for the first cycles right now, and based on the results I plan to then have another cron job (probably every hour) that checks how many times a twitter screen name was mentioned in groovy tweets. Above a certain threshold, I plan to start following that user automatically. But till then, I want to monitor this a couple of days longer.

GORM-JPA Plugin: I must admit I am still using plain Grails 1.1.1 and the app-engine plugin, but GORM-JPA should for sure be the future for using Grails as close as possible to the original promise (with GORM) on GAE/J. I will probably create another little test app just for trying out GORM-JPA and then modify the existing code once I know it works fine.
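A rough sketch of the scanning cycle described above: pick two random friends, collect the @mentions from their groovy tweets, and later check which screen names cross a threshold. All names, the sample tweets and the threshold value used here are assumptions; the post only says "above a certain threshold".

```python
import random
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def pick_friends_to_scan(friends, how_many: int = 2):
    """Choose random friends for this cycle (two per cycle in the text)."""
    return random.sample(sorted(friends), min(how_many, len(friends)))


def count_mentions(groovy_tweets) -> Counter:
    """Count how often each screen name is mentioned in groovy tweets."""
    counts = Counter()
    for text in groovy_tweets:
        counts.update(name.lower() for name in MENTION.findall(text))
    return counts


def follow_candidates(counts: Counter, friends, threshold: int):
    """Screen names mentioned often enough that are not friends yet."""
    return sorted(name for name, n in counts.items()
                  if n >= threshold and name not in friends)


friends = {"alice", "bob"}
groovy_tweets = [
    "@carol nice Groovy talk!",
    "@Carol @bob the Grails plugin works now",
    "@dave have you tried Griffon?",
    "@carol thanks for the groovy tip",
]
counts = count_mentions(groovy_tweets)
candidates = follow_candidates(counts, friends, threshold=3)
```

In this made-up data only one screen name reaches the assumed threshold of 3, and existing friends are skipped even if they are mentioned.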

groovytweets.org update 2

In General Stuff on June 17, 2009 at 10:05 pm

I just polished the look and feel of the site and created a basic layout for groovytweets.org. Also, there was a bug related to @username replies where the @ would no longer show up which is fixed now. In addition to linking @usernames from tweets, #hashtags are now linked directly to twitter search for that tag. I decided to keep those kind of links in the same font as the regular status text because I believe it distracts too much to see a link every five words in a tweet… but move your mouse over the text and you will see the pointer change.

What’s next?

  • I noticed a few retweet formats still have issues. For example, multiple retweets might not get recognized… I am not sure if I can make all those cases work, but I’ll try. Also, tweets posted via twitlonger are currently not recognized as retweets and therefore do not increase the relevance of the original tweet. This should be quite easy to solve, as I can just search with the original fragment that is still contained in the retweet, assuming that such a fragment is still unique enough for a tweet lookup.
  • Special views. Views like only tweets with a higher than normal relevance or views only with links (or both). This will allow some of you to not waste time checking the groovy news by just concentrating on the relevant stuff.
  • RSS Feeds. Yes, of course. But I first need some more interesting views. I might just create one RSS feed to keep things simple that only publishes low-volume relevant tweets. I think this is what we all want: less noise! If you want to get the full scoop, just visit the website in this case.
  • Instead of or in addition to creating an RSS feed, groovytweets could start tweeting the relevant news itself. All tweets would be retweets in this case (which should work 100% as only the tweets that were ‘retweetable’ can achieve higher relevance). This might be an interesting friend for people who really just want to get the top news each day without participating in the whole retweeting hell.
  • REST API. James Williams posted this idea in reply to my initial announcement on the Grails Mailing list. All it would take is probably a ‘list’ api, e.g. ‘render tweets as JSON / XML’ I guess. This could then be integrated into a Griffon app, a desktop client for the groovytweets site. Although I must say once groovytweets starts tweeting the news itself, following the news becomes as easy as adding groovytweets to your existing twitter desktop client. But still I am interested in getting to know Griffon.

Any other ideas? Send me an email to hansamann (at) yahoo.de. I’ll be on vacation at Tahoe the next couple of days (but podcasting again on Saturday, so watch out for a new episode with some GAE/J content). I will reply once I am back.

p.s pls retweet!

groovytweets.org update

In General Stuff on June 16, 2009 at 11:16 pm

Since I announced groovytweets.org yesterday on the Groovy and Grails Mailing lists, I have received a lot of great feedback and great ideas to improve the service. I just finished a couple of updates, but let me first explain the idea behind groovytweets.org.

Glen and I run the bi-weekly Grails Podcast. To prepare the podcast, we have to keep track of all the relevant news over the last 2 weeks, compile it into a storybook for the show and then read and discuss it. We typically use Glen’s excellent groovyblogs.org, direct emails to grails.podcast@gmail.com and the Groovy Twitter Cleaner Pipe that I created some time ago.

The Groovy Twitter Cleaner Pipe scans the twitter universe for groovy/grails/griffon and then tries to remove all the non-important and non-groovy scripting related tweets. The pipe filters the results down; for example, I remove all tweets not containing links as they have little news value (no source for further information). The Groovy Twitter Cleaner Pipe sometimes provides some great new hits, but quite often it also creates a lot of noise from unrelated tweets. It is quite hard to filter out links to groovy images (60s, etc.) and keep groovy scripting related tweets. For Grails, unfortunately there seems to be a music band, too. And for Griffon, EVERYTHING can be a griffon, really: dogs, jackets, etc.

So the issue with the Groovy Twitter Cleaner Pipe is clearly quality. Too much noise. That’s why I started groovytweets.org. Just like groovyblogs.org, I begin with a manually selected group of ‘Groovy People’ in our community. If these people tweet about ‘groovy’, it is most likely really about the scripting language. After this was done, I thought about how I can measure relevancy. That’s where retweets come into the game. Retweets are a kind of endorsement: if you retweet, it means you want to share some great content with your followers. So whenever I discover a retweet in the timeline of the groovytweets friends, I swallow the RT and instead look up the original tweet and increase that tweet’s relevance.

These are the Retweet formats that are currently supported (I just updated this piece and added one new format):

  • RT @username <original>
  • RT: @username <original>
  • RT: @username: <original>
  • <original> (via @username)
  • ♺ @username: <original>
  • ♺ @username <original>

Most people will probably use RT @username: <original> which I also like the most. It immediately shows the source of the original and is in my opinion the most honest retweet.
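The formats listed above can be covered with two regular expressions: one for the prefix styles (RT or ♺, with optional colons) and one for the trailing (via @username) style. This is a sketch of one possible parser, not the pattern groovytweets actually uses; the function name and sample tweets are invented.

```python
import re

# Prefix styles: "RT @user text", "RT: @user text", "RT: @user: text",
# "RT @user: text", "♺ @user: text", "♺ @user text".
PREFIX = re.compile(r"^(?:RT|♺):?\s*@(\w+):?\s+(.*)$", re.DOTALL)
# Suffix style: "text (via @user)".
VIA = re.compile(r"^(.*?)\s*\(via\s+@(\w+)\)\s*$", re.DOTALL)


def parse_retweet(text: str):
    """Return (username, original_text) for a retweet, or None otherwise."""
    text = text.strip()
    match = PREFIX.match(text)
    if match:
        return match.group(1), match.group(2)
    match = VIA.match(text)
    if match:
        return match.group(2), match.group(1)
    return None


examples = [
    "RT @someuser new release is out",
    "RT: @someuser: new release is out",
    "♺ @someuser: new release is out",
    "new release is out (via @someuser)",
]
parsed = [parse_retweet(t) for t in examples]
```

Every example yields the same username and original text, which is what makes it possible to look up the original tweet and raise its relevance regardless of the format used.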

The only issue with retweets is that sometimes the retweeted messages are too long, i.e. > 140 characters. In this case one can use twitlonger.com, and I plan to support this soon, too. But the best advice regarding the retweetability of tweets is really to keep the messages to around 120 characters, so others can actually retweet them easily.
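Why roughly 120 characters? A quick sketch of the length check, using the RT @username: <original> format and the 140-character limit from the text. The function names and the 14-character example username are assumptions for illustration.

```python
TWEET_LIMIT = 140


def build_retweet(username: str, original: str) -> str:
    """Compose a retweet in the 'RT @username: <original>' format."""
    return f"RT @{username}: {original}"


def is_retweetable(username: str, original: str) -> bool:
    """True if the composed retweet still fits into a single tweet."""
    return len(build_retweet(username, original)) <= TWEET_LIMIT


username = "a" * 14                 # hypothetical 14-character screen name
overhead = len(build_retweet(username, ""))

fits = is_retweetable(username, "x" * 120)
too_long = is_retweetable(username, "x" * 121)
```

With a 14-character screen name the retweet prefix alone takes 20 characters, so an original of 120 characters just fits and one of 121 does not.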

Besides the new RT formats, these features were added today:

  • groovytweets is available at www.groovytweets.org. As naked domains are not supported right now by GAE/J domains, I have to use the www.
  • @<username> in the status text is now directly linked to the twitter profile page of that user
  • the tweet matching pattern was slightly improved, I removed /gram/ which was found too often in tweets like ‘I like programming my DVR’. I will consider using word boundaries \b if gram turns out to be a still relevant groovy technology, but I have not heard too much about it so far…
  • the still basic groovytweets UI is now not only pulling in new tweets (via AJAX), it also updates the relevance of existing tweets in the HTML page every minute.
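The /gram/ problem from the list above, and the word-boundary fix being considered, can be seen in a few lines. The sample sentence is the one from the text; the second sample and the variable names are invented.

```python
import re

loose = re.compile(r"gram", re.IGNORECASE)        # matches inside other words
strict = re.compile(r"\bgram\b", re.IGNORECASE)   # whole word only

noise = "I like programming my DVR"
on_topic = "Trying out Gram today"

loose_hits_noise = loose.search(noise) is not None
strict_hits_noise = strict.search(noise) is not None
strict_hits_on_topic = strict.search(on_topic) is not None
```

The loose pattern fires on "programming", while the \b version ignores it and still matches the standalone word.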

The whole app runs on GAE/J and right now uses just about 1% of the daily compute allowance. So I got hope to keep it free, but I might consider using some google ads or so. Let’s see, this is not the most important thing right now, I am first trying to get the core service right.

I also want to upload the source code to github, but right now the credentials are still in the code. I plan to refactor the credentials into a database entity soon so I can keep the source code on github complete but leave all credentials out.

More about my experience with GAE/J and Grails 1.1.1 and the app-engine plugin will follow in the next podcast.

The perfect JavaOne(tm) Scripting Schedule v0.1

In General Stuff on April 28, 2009 at 7:19 pm

I finally found time to look closer at this year’s JavaOne(tm) sessions. Of course I had my special scripting hat on and created a scripting-optimized schedule. Groovy (including our own grailspodcast.com BOF), Scala, Ruby/JRuby are all in.

While there is no dedicated Grails session, there is the ‘Grails Integration Strategies BOF’ that all Grails lovers should visit. Followed by our grailspodcast.com Groovy & Grails (&Griffon) BOF.

Well, have a look for yourself and send me feedback and comments for alternative sessions around scripting or sessions I missed.

YouTube HD

In General Stuff on March 8, 2009 at 12:21 pm

Soon after I discovered vimeo.com, I checked YouTube for the latest news. And as it happens, they activated HD uploads just a few days ago. The annoying 10 minute limit is still there, but files can be up to 1GB large. Not perfect, but vimeo just gives you 500 MB per week and only one HD upload. As all my videos will be HD and you can really easily hit that 500MB limit with HD videos (10 minutes was 800MB!), I took a closer look at YouTube again.

The only freakish thing is really the YouTube address book. I just cannot figure out how to send an invite to my Gmail/Yahoo contacts; that feature does not seem to exist. Can it? This is freaking me out as I have to send around the channel address via email now and then manage the subscribers myself.

Also, it seems you have to manually select the contacts for private sharing, plus you can only share up to 25 contacts. Wow. Seems like Google does not really encourage closed communities like our family videos in this case.

Vimeo for HD videos

In General Stuff on March 6, 2009 at 11:31 am

Just discovered vimeo.com and it totally rocks. You get 500MB of upload space a week (!) and one HD upload per week. I created a small screencast in HD format to test the quality and it is truly amazing. I previously experimented with youtube for screencasting, but the quality is way too low. Also the limitation of 10 minutes is a real issue for screencasts.

My first post from Yahoo! Mail

In General Stuff on January 30, 2009 at 10:07 pm

I am just using the new WordPress application from within Yahoo Mail (beta). What is the benefit using WordPress from within Y!Mail, why not go directly to WordPress? Well, looks like I come across my Yahoo! Mail many times a day while I constantly forget to sign in at WordPress. So I might be reminded to post more often…

The next step: all kids in preschool

In General Stuff on January 27, 2009 at 9:22 pm

Beginning next month, we’ll have 3 kids in a US preschool (or jr. preschool). It means everybody in our family will soon be +2yrs and we progress to the next step. It also means we have a bit more time whenever the kids are gone, plus we have at least the good feeling that it is good for the kids to grow up bilingually (I doubt it will last long once we have moved back to Germany).

We also have about one year left in the US. Maybe a bit more, who knows, but roughly one year. There are still many things we want to see and many things we want to buy (because it is cheaper here…. and who cares about a US keyboard, just drop those German Umlauts!). Let’s see if we can travel to all the places we want to. My wife will soon visit friends we met here in Minnesota, plus I will soon go again to Squaw, the best snowboarding resort I have found here so far (I have been to Homewood, CA, which has an awesome view of Lake Tahoe, and to Kirkwood, CA so far). We’ll see if we can make a final trip up to Canada, especially to Vancouver. Vancouver is the place I did my MBA, where I still have some friends, and it is a place I really fell in love with. There are probably many nice cities, but having spent so much time there and ‘knowing’ a great city as more than just a tourist is indispensable.

Let’s see. We also accept visitors for 2009 now. The calendar is filling for March, June and August already. Want to visit? We’re here until 2010…

Next Stop: Munich. I miss the Saunas, really.

Winter in California

In General Stuff on January 13, 2009 at 2:53 am

If you do not live in California, you might have noticed it really got cold outside. California was colder, too, the last couple of days, but again today temperatures were in the 20C area and we were able to have lunch outside.

It is still too cold, as winter is a relative thing. If you are used to Californian summers, winter is still….. cold. And I immediately got a strange throat infection that I have been carrying around for a week now.

What would happen if you came from a CA summer directly into a German winter….. uuuh. Without some extra Sauna sessions I believe my body would not be ready.