
Yahoo Pipes: Analyzing Digg, Part 1: By Submitter

For those of you who like to follow social media sites such as Digg, an easy analysis tool may be useful. Yahoo Pipes lets you quickly put together a set of tools to organize a web feed’s items. In this example, I’m going to sort the Digg homepage RSS feed by the submitter of each story.

To do that, we need to manipulate some of the content of the Digg feed using the Yahoo Pipes Regex (regular expressions) module. Beyond that, all the information we need is already in the feed.

Regular expression patterns:
I’m not going to get into an elaborate discussion of regexes. Instead, I’ll just list what I’ve used in the screencast video. (If you’re familiar with regexes already, bear with me.)

  1. ^ – caret – match the beginning of a string.
  2. $ – dollar – match the end of a string.
  3. .* – dot star – match any sequence of characters (including none).
  4. ^.*$ – match the entire string.
  5. (^.*$) – match the entire string and capture it as group 1, which the replacement can refer to as $1.
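If you want to try these patterns outside of Pipes, here is a minimal sketch using Python's `re` module. Python is just my stand-in here, not what Pipes runs, and its replacement strings refer to the captured group as `\1` or `\g<1>` rather than `$1`.

```python
import re

title = "Paris' Sob Story"

# ^ and $ are zero-width anchors: they match positions, not characters.
start = re.search(r"^", title)
end = re.search(r"$", title)
print(start.start(), end.start())

# ^.*$ matches the whole string (as long as it is a single line).
whole = re.search(r"^.*$", title)
print(whole.group(0))

# Parentheses capture the match so it can be reused as group 1.
captured = re.search(r"(^.*$)", title)
print(captured.group(1))
```

Note that `.` does not match a newline by default in Python, so `^.*$` only covers the entire string when the string is one line.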

Digg feed variables used:
The Digg home page RSS feed has a number of fields (variables) that we can access in Yahoo Pipes. In this example, I’ve only used one:

digg:submitter.digg:username

Within Yahoo Pipes, to access it, we prefix it with a dollar sign and wrap it in braces (curly brackets):

${digg:submitter.digg:username}
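To make the dotted name less mysterious, here is a rough Python model of what such a reference does. It assumes the feed item behaves like nested dictionaries, which is my guess at the shape and not a description of Pipes internals. The `lookup` and `expand` helpers are hypothetical names of my own.

```python
import re

# Hypothetical item, using the title and username from the example in this post.
item = {
    "title": "Paris' Sob Story",
    "digg:submitter": {"digg:username": "RainbowPhoenix"},
}

def lookup(item, path):
    """Walk a dotted path such as 'digg:submitter.digg:username'."""
    value = item
    for key in path.split("."):
        value = value[key]
    return value

def expand(template, item):
    """Replace each ${path} in the template with the matching item value."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: str(lookup(item, m.group(1))),
        template,
    )

print(expand("[${digg:submitter.digg:username}]", item))
```

The dot separates nesting levels, so `digg:submitter.digg:username` reads as "the `digg:username` value inside `digg:submitter`".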

Process:
These are the steps I take in the video below.

  1. Grab the Digg home page feed.
  2. Insert the digg username (of the story submitter) in the item.title field’s values, at the beginning of the title, surrounded by square brackets.
  3. Do the same with the item.y:title field. (This is probably redundant, but it’s not a big deal.)
  4. Replace the item.description fields with nothing – i.e., an empty string. For our analysis, getting rid of the description reduces visual clutter in the results. It’s just easier to see only the title and submitter.
  5. Sort the resulting manipulated feed by the item.title.
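The five steps above can be sketched in a few lines of Python. This is only an approximation of the pipe, not an export of it: the items are hard-coded stand-ins for the fetched feed, everything except the first title and username is made up, and `tag_with_submitter` is a hypothetical helper name.

```python
import re

# Step 1 stand-in: hypothetical items instead of a live feed fetch.
items = [
    {
        "title": "Paris' Sob Story",
        "y:title": "Paris' Sob Story",
        "description": "Placeholder description text.",
        "digg:submitter": {"digg:username": "RainbowPhoenix"},
    },
    {
        "title": "Example Story B",
        "y:title": "Example Story B",
        "description": "Placeholder description text.",
        "digg:submitter": {"digg:username": "ExampleUser"},
    },
    {
        "title": "Example Story A",
        "y:title": "Example Story A",
        "description": "Placeholder description text.",
        "digg:submitter": {"digg:username": "RainbowPhoenix"},
    },
]

def tag_with_submitter(item):
    user = item["digg:submitter"]["digg:username"]
    prefix = f"[{user}] "
    tagged = dict(item)
    # Steps 2 and 3: put "[username] " in front of both title fields.
    # A function is used as the replacement so that any backslashes in a
    # username are not interpreted as escape sequences.
    for field in ("title", "y:title"):
        tagged[field] = re.sub(
            r"(^.*$)", lambda m: prefix + m.group(1), item[field]
        )
    # Step 4: blank out the description (DOTALL lets . span line breaks).
    tagged["description"] = re.sub(
        r"^.*$", "", item["description"], flags=re.DOTALL
    )
    return tagged

# Step 5: sort by the rewritten title, which groups stories by submitter.
by_submitter = sorted(
    (tag_with_submitter(i) for i in items), key=lambda i: i["title"]
)

for entry in by_submitter:
    print(entry["title"])
```

Because the username now leads each title, a plain sort on the title puts all of one member's stories next to each other. Python's sort is case-sensitive, which may or may not match how the Pipes Sort module orders titles.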

What we’re doing is replacing a story title such as

Paris’ Sob Story

with

[RainbowPhoenix] Paris’ Sob Story

for each home page story. The string in the square brackets is the name of the Digg member who submitted the article. So ^.*$ matches “Paris’ Sob Story”, and the parentheses capture this string as $1. Thus the Regex rule for item.title replaces (^.*$) with the string below, which puts the current Digg username in square brackets in front of the original title:

[${digg:submitter.digg:username}] $1

Other than getting rid of the story description, this is all we’re really doing, followed by a sort on the title values.

Yahoo Pipes modules used:

  1. Fetch Feed
  2. Regex
  3. Sort
  4. Output

Here’s a SplashCast screencast showing the process of creating the Pipe. (Apologies for the choppy narration, as I had to use an earlier voiceover due to upload problems.)

[Screencast: Yahoo Pipes, Digg by Submitter]

You can clone my Digg by Submitter pipe and tweak it to your heart’s content. Or wait for the next one. In the next part of this mini-series, we’ll sort the Digg homepage by category (and prove an Apple bias for the home page).


6 thoughts on “Yahoo Pipes: Analyzing Digg, Part 1: By Submitter”

  1. Tyler: Well, I’m leading up to that. Obviously, a metric that’s useful to one SEO isn’t necessarily useful to another. This is an example of Pipes functionality. It lets you quickly see which Digg member is getting on the home page the most in a given day. The next few examples look at other Digg information (comments, votes, categories, etc.)

    So you can quickly pick up the Pipes functionality from these examples and analyze the feeds that are important to you.

  2. Note: I realized afterwards that the regex rule (^.*$) could probably have been just ^

    I haven’t tried it, but it should work.
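    For what it's worth, here is a quick check of that idea in Python rather than in Pipes. It only shows that the two rules agree under Python's regex engine. Whether the Pipes Regex module handles a bare ^ the same way is still an assumption.

```python
import re

title = "Paris' Sob Story"
prefix = "[RainbowPhoenix] "

# Original rule: capture the whole title, then re-emit it after the prefix.
with_capture = re.sub(r"(^.*$)", lambda m: prefix + m.group(1), title)

# Shorter rule: replace the zero-width start-of-string position with the prefix.
with_anchor = re.sub(r"^", lambda m: prefix, title)

print(with_anchor)
print(with_capture == with_anchor)
```

    The bare ^ version matches an empty position, so nothing from the title is consumed and there is nothing to capture or re-emit.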