For those of you who like to follow social media sites such as Digg, an easy analysis tool may be of some use. Yahoo Pipes lets you very quickly put together a suite of tools to organize a web feed’s items. In this example, I’m going to sort the Digg homepage RSS feed by the submitter of each story.
To do that, we need to manipulate some of the content of the Digg feed using the Yahoo Pipes Regex (regular expressions) module. Otherwise, all the information we need is in the feed.
Regular expression patterns:
I’m not going to get into an elaborate discussion of regexes. Instead, I’ll just list what I’ve used in the screencast video. (If you’re familiar with regexes already, bear with me.)
- ^ - caret - match the beginning of a string.
- $ - dollar - match the end of a string.
- .* - dot star - match any sequence of characters.
- ^.*$ - match the entire string.
- (^.*$) - match the entire string and save it in parameter 1, aka $1.
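To see these patterns in action outside of Pipes, here is a minimal sketch using Python's `re` module. It is only an illustration: the Pipes Regex module may differ in details, and Python writes the saved parameter as group 1 rather than $1. The sample title is the one used later in this post.

```python
import re

title = "Paris’ Sob Story"

start = re.search(r"^", title)          # zero-width match at the beginning
end = re.search(r"$", title)            # zero-width match at the end
anything = re.search(r".*", title)      # greedy: any run of characters on the line
whole = re.search(r"^.*$", title)       # the entire (single-line) string
captured = re.search(r"(^.*$)", title)  # same match, saved as group 1

print(start.span(), end.span())
print(anything.group(0))
print(whole.group(0))
print(captured.group(1))
```

Note that in Python, as in most regex engines, the dot does not match a newline by default, so these patterns assume a single-line title.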
Digg feed variables used:
The Digg home page RSS feed has a number of fields/variables that we can access in Yahoo Pipes. In this example, I’ve only used one:
digg:submitter.digg:username
Within Yahoo Pipes, to access it, we prefix it with a dollar sign and place braces (curly brackets) around it:
${digg:submitter.digg:username}
Process:
These are the steps I take in the video below.
- Grab the Digg home page feed.
- Insert the digg username (of the story submitter) in the item.title field’s values, at the beginning of the title, surrounded by square brackets.
- Do the same with the item.y:title field. (This is probably redundant, but it’s not a big deal.)
- Replace the item.description fields with nothing - i.e., an empty string. For our analysis, getting rid of the description reduces visual clutter in the results. It’s just easier to see only the title and submitter.
- Sort the resulting manipulated feed by the item.title.
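The steps above can be sketched in plain Python, with feed items modeled as dictionaries. This is a rough stand-in for the Pipe, not how Pipes works internally. The `submitter` key, the function name, and the two "Hypothetical" sample stories are made up for illustration, and the sort uses Python's default case-sensitive string ordering, which may not match the Pipes Sort module exactly.

```python
def sort_by_submitter(items):
    """Prefix titles with the submitter, blank descriptions, sort by title."""
    result = []
    for item in items:
        tagged = dict(item)  # copy, so the original item is left untouched
        prefix = "[{}] ".format(item["submitter"])
        # Prefix item.title, and item.y:title when the item has one.
        for field in ("title", "y:title"):
            if field in tagged:
                tagged[field] = prefix + tagged[field]
        tagged["description"] = ""  # drop the description to reduce clutter
        result.append(tagged)
    return sorted(result, key=lambda entry: entry["title"])


# Hypothetical sample data; only the first title and username come from the post.
sample = [
    {"title": "Paris’ Sob Story", "submitter": "RainbowPhoenix",
     "description": "placeholder"},
    {"title": "Hypothetical Story A", "submitter": "AliceExample",
     "description": "placeholder"},
    {"title": "Hypothetical Story B", "submitter": "RainbowPhoenix",
     "description": "placeholder"},
]

for entry in sort_by_submitter(sample):
    print(entry["title"])
```

Because every title now starts with the submitter's name, sorting by title groups each member's stories together.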
What we’re doing is replacing a story title such as
Paris’ Sob Story
with
[RainbowPhoenix] Paris’ Sob Story
for each home page story. The string in the square brackets is the name of the Digg member who submitted the article. So ^.*$ matches “Paris’ Sob Story”, and the parentheses assign this string to $1. Thus the Regex replace rule (^.*$) for item.title takes the very same title and inserts the current Digg username in square brackets in front of it, using this replacement string:
[${digg:submitter.digg:username}] $1
Other than getting rid of the story description, this is all we’re really doing, followed by a sort on the title values.
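For comparison, the same replace rule can be approximated with Python's `re.sub`. This is a sketch under assumptions: `add_submitter` is a hypothetical helper name, the username is passed in as a plain argument rather than expanded from ${digg:submitter.digg:username}, and a replacement function stands in for the $1 reference so that any backslashes in a username are not treated as escapes.

```python
import re


def add_submitter(title, username):
    """Apply the (^.*$) rule: put [username] in front of the captured title."""
    return re.sub(
        r"(^.*$)",
        # m.group(1) plays the role of $1 in the Pipes replacement string.
        lambda m: "[{}] {}".format(username, m.group(1)),
        title,
        count=1,
    )


print(add_submitter("Paris’ Sob Story", "RainbowPhoenix"))
```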
Yahoo Pipes modules used:
- Fetch Feed
- Regex
- Sort
- Output
Here’s a SplashCast screencast showing the process of creating the Pipe. (Apologies for the choppy narration, as I had to use an earlier voiceover due to upload problems.)
You can take my Digg by Submitter pipe, clone and tweak it to your heart’s content. Or wait for the next one. In the next part of this mini-series, we’ll sort the Digg homepage by category (and prove an Apple bias for the home page).
Comments
5 responses so far ↓
Tyler Dewitt on Jun 9, 2007 at 6:07 am
neato, whats this do though? I mean can you do some crazy stuff by knowing all this?
Raj Dash on Jun 9, 2007 at 10:01 am
Tyler: Well I’m leading up to that. Obviously, when it comes to metrics, one metric that’s useful to one SEO isn’t to another. This is an example of Pipes functionality. It lets you quickly see which Digg member is getting on the home page the most in a given day. The next few examples look at other Digg information (comments, votes, categories, etc.)
So you can quickly pick up the Pipes functionality from these examples and analyze the feeds that are important to you.
Raj Dash on Jun 10, 2007 at 2:49 am
Note: I realized afterwards that the regex rule (^.*$) could probably have been just ^
I haven’t tried it, but it should work.
Motorcycle Guy on Jun 11, 2007 at 7:24 am
I haven’t tried pipes out, but i constantly hear about it. It looks really neat though.
Rose Oil on Jun 12, 2007 at 3:04 pm
Looks like this article made it to the digg frontpage