Yahoo Pipes: Analyzing Digg, Part 1: By Submitter

For those of you who like to follow social media sites such as Digg, an easy analysis tool may be useful. Yahoo Pipes lets you quickly put together a set of tools to organize a web feed’s items. In this example, I’m going to sort the Digg homepage RSS feed by the submitter of each story.

To do that, we need to manipulate some of the Digg feed’s content using the Yahoo Pipes Regex (regular expressions) module. Everything else we need is already in the feed.

Regular expression patterns:
I’m not going to get into an elaborate discussion of regexes. Instead, I’ll just list what I’ve used in the screencast video. (If you’re familiar with regexes already, bear with me.)

  1. ^ – caret – match the beginning of a string.
  2. $ – dollar – match the end of a string.
  3. .* – dot star – match any sequence of characters (including none).
  4. ^.*$ – match the entire string.
  5. (^.*$) – match the entire string and save it in capture group 1, aka $1.
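
If you want to try these patterns outside of Pipes, here is a minimal sketch using Python's standard re module. Python is my choice for illustration only and is not part of the original screencast; the variable names are made up, and the sample title is the one used later in this post.

```python
import re

title = "Paris' Sob Story"

# ^ anchors the match at the beginning of the string.
starts_with_paris = re.search(r"^Paris", title) is not None

# $ anchors the match at the end of the string.
ends_with_story = re.search(r"Story$", title) is not None

# .* matches any run of characters (possibly empty) within one line.
middle = re.search(r"Paris(.*)Story", title).group(1)

# (^.*$) matches the whole string and captures it as group 1.
# Pipes refers to that group as $1; Python uses group(1) or \1.
whole = re.search(r"(^.*$)", title).group(1)

print(starts_with_paris, ends_with_story, repr(middle), repr(whole))
```

The only real difference to keep in mind is the notation for the captured group: $1 in Pipes versus \1 or group(1) in Python.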

Digg feed variables used:
The Digg home page RSS feed has a number of fields/variables that we can access in Yahoo Pipes. In this example, I’ve only used one:

digg:submitter.digg:username

Within Yahoo Pipes, to access it, we place braces (curly brackets) around it, prefixed with a dollar sign:

${digg:submitter.digg:username}

These are the steps I take in the video below.

  1. Grab the Digg home page feed.
  2. Insert the Digg username (of the story submitter) at the beginning of each item.title value, surrounded by square brackets.
  3. Do the same with the item.y:title field. (This is probably redundant, but it’s not a big deal.)
  4. Replace the item.description fields with nothing – i.e., an empty string. For our analysis, getting rid of the description reduces visual clutter in the results. It’s just easier to see only the title and submitter.
  5. Sort the resulting manipulated feed by the item.title.
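
For readers who think better in code, the steps above can be sketched roughly in Python. This is not how Pipes works internally; it assumes each feed item has already been parsed into a dictionary, and the keys "title", "description" and "submitter", the function name, and the second sample item are all hypothetical. Plain string formatting stands in for the Regex module here, and the item.y:title step is left out.

```python
def sort_by_submitter(items):
    """Prefix each title with [submitter], blank the description, sort by title."""
    result = []
    for item in items:
        result.append({
            **item,
            # Step 2: put the submitter's username in front of the title.
            "title": f"[{item['submitter']}] {item['title']}",
            # Step 4: drop the description to cut visual clutter.
            "description": "",
        })
    # Step 5: sort by the rewritten title, which now starts with the username.
    return sorted(result, key=lambda it: it["title"])


# Hypothetical sample data; only the first title and username come from this post.
sample_items = [
    {"title": "Paris' Sob Story", "description": "Some summary text",
     "submitter": "RainbowPhoenix"},
    {"title": "Another Story", "description": "Some summary text",
     "submitter": "ExampleUser"},
]

for entry in sort_by_submitter(sample_items):
    print(entry["title"])
```

Because every title now begins with the bracketed username, an ordinary sort on the title groups each submitter's stories together.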

What we’re doing is taking a story title such as

Paris’ Sob Story

and turning it into

[RainbowPhoenix] Paris’ Sob Story

for each home page story. The string in the square brackets is the name of the Digg member who submitted the article. So ^.*$ matches “Paris’ Sob Story”, and the parentheses assign this string to $1. Thus the Regex rule for item.title replaces whatever (^.*$) matches with the very same title, preceded by the current Digg username in square brackets. The replacement string is:

[${digg:submitter.digg:username}] $1

Other than getting rid of the story description, this is all we’re really doing, followed by a sort on the title values.
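
The same replace rule can be approximated with Python's re.sub, as a rough sketch. The function name prefix_submitter is hypothetical, and the username is passed in as a plain argument rather than read from the feed the way ${digg:submitter.digg:username} is in Pipes.

```python
import re


def prefix_submitter(title, username):
    """Replace the whole title with '[username] ' followed by the original title."""
    # A function replacement is used so that any backslashes in the username
    # are inserted literally instead of being read as escape sequences.
    return re.sub(r"(^.*$)", lambda m: f"[{username}] {m.group(1)}", title)


tagged = prefix_submitter("Paris' Sob Story", "RainbowPhoenix")
print(tagged)
```

Here m.group(1) plays the role of $1 in the Pipes replacement string.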

Yahoo Pipes modules used:

  1. Fetch Feed
  2. Regex
  3. Sort
  4. Output

Yahoo Pipes - digg homepage sorted by submitter

You can take my Digg by Submitter pipe, clone and tweak it to your heart’s content. Or wait for the next one. In the next part of this mini-series, we’ll sort the Digg homepage by category (and prove an Apple bias for the home page).

  • Tyler Dewitt

    neato, whats this do though? I mean can you do some crazy stuff by knowing all this?

  • Raj Dash

    Tyler: Well I’m leading up to that. Obviously, when it comes to metrics, one metric that’s useful to one SEO isn’t to another. This is an example of Pipes functionality. It lets you quickly see which Digg member is getting on the home page the most in a given day. The next few examples look at other Digg information (comments, votes, categories, etc.)

    So you can quickly pick up the Pipes functionality from these examples and analyze the feeds that are important to you.

  • Raj Dash

    Note: I realized afterwards that the regex rule (^.*$) could probably have been just ^

    I haven’t tried it, but it should work.

  • Motorcycle Guy

    I haven’t tried pipes out, but i constantly hear about it. It looks really neat though.

  • Rose Oil

    Looks like this article made it to the digg frontpage

  • Karen Newton

    Love your pipe! There’s so much you can do with Yahoo! Pipes, it’s a lot of fun exploring.