IDEA #45 – Social Connections Search Engine

David Cohen shared a few ideas with me. One in particular struck me as pretty good, so I thought I’d share it with you:

Build a site that shows you the social connections of any person. This thing would have to crawl the web looking for microformats, blogrolls, blog posts, etc in order to automatically build a list of relationships. Each relationship would have a strength based on how often the two people refer to each other, etc. This could be used to figure out how to get to someone from a networking perspective.

Once again, I see this having characteristics similar to Google’s PageRank system: the software would look at an individual’s blog and count how many times it references any person (or website), then assign each referenced person or website a value based on that count. Thus, if I had 25 outbound links on my blog and 5 of them (20%) pointed to TechCrunch, I’d have a very strong relationship with it (admiration, or possibly something mutual). If TechCrunch also linked back to me often, relative to the rest of its outbound links, we’d be able to tell that the relationship or friendship was mutual.
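
To make the 25-links example concrete, here is a rough Python sketch of that scoring. It is only an illustration, not how any real engine works: the function names `link_shares` and `relationship` and the 10% `threshold` are my own assumptions, and hosts are compared literally (so `www.techcrunch.com` and `techcrunch.com` would count as different sites).

```python
from collections import Counter
from urllib.parse import urlparse


def link_shares(outbound_urls):
    """Return {host: fraction of this blog's outbound links pointing at host}."""
    hosts = [urlparse(url).netloc.lower() for url in outbound_urls]
    hosts = [host for host in hosts if host]  # drop URLs without a host
    counts = Counter(hosts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {host: n / total for host, n in counts.items()}


def relationship(a_to_b, b_to_a, threshold=0.1):
    """Classify a pair of link shares. The threshold is an arbitrary assumption."""
    a_strong = a_to_b >= threshold
    b_strong = b_to_a >= threshold
    if a_strong and b_strong:
        return "mutual"
    if a_strong or b_strong:
        return "one-way"
    return "weak"


# 25 outbound links, 5 of them to TechCrunch (the other hosts are made up).
my_links = ["http://techcrunch.com/post%d" % i for i in range(5)]
my_links += ["http://blog%d.example.com/" % i for i in range(20)]

shares = link_shares(my_links)
print(shares["techcrunch.com"])   # 0.2
print(relationship(0.2, 0.0))     # one-way
```

Using a share of outbound links rather than a raw count keeps a blog that links to everything from looking like everyone's best friend.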

There could then be a website that offers up a personalized “news” webpage based on a specific website — it would display posts, recent bookmarks, etc, from the assumed friends/relationships that a website has with other websites. Thus, if you visited this proposed page for Techquila Shots, you may see recent TechCrunch posts, NetNagel posts, my recent bookmarks, recent bookmarks from others I’ve previously linked to, etc.

“Social Search” (or “personalized search”) is the next big thing in Search — I consider and some others to already be social search engines, but there’ll be one that’ll become huge enough and challenge Google. Could this be it? Likely not, but feel free to chime in with any comments or other ideas you have to extend this idea.

Hint: If you are an engineer with a background in search, this might be an idea to pitch to TechStars, considering David Cohen (one of the guys behind TechStars) is the one that shared this idea :)

  • maleksh da3wa

    this one deserves a digg

  • maleksh da3wa

    ok… the idea is great, but you might also be in business with Mike Arrington 😉 and keep linking to each other; then you’re at the top of the list?

  • James D Kirk

    How quickly we forget! I posted this back in February, and pinged this site in the article!

    Maybe I should use it as an idea for TechStars!

  • alex

    How do you determine an identity? If you link to TechCrunch, are you in fact linking to Mike Arrington, or to his blog (which consists of a number of contributors)?

  • David Cohen

    Captain Kirk – we should talk!

    Maleksh – It’s not a popularity “list” that would be valuable, but rather the ability to search on a person (or between people) to better understand the social or professional ties that exist between them.

  • Chris Saad

    The problem with this idea is mapping different instances of people to the same ‘person’.

    I.e., a blog + MySpace account + email address + hCard might all represent one person, but how do you automatically correlate them?

  • David Cohen

    Chris – that would be part of the magic, you’re right. I think if someone gets into this, they’ll find that there are some “very high confidence” correlations (such as when somebody references their LinkedIn profile on their blog) and then some possible matches. I think for the system to be successful, it would need to only count on the high confidence stuff, and be able to expand the result to include less confidently held information upon request.

  • Tom C.

    There are certain bloggers whose ideas/opinions/sentiment/notions I respect, and most often I find myself trying to figure out their sphere of influence. This search engine would definitely give me a comprehensive view of their sphere of influence. Since the quality of their relationships is not really important in this case (to begin with), I think a simple “degree centrality” algorithm that runs over the crawled links should be sufficient to start with. However, the degree centrality algorithm would have to consider not just links as the graph edges connecting the graph nodes, but 1) links, 2) mentions and 3) inter-blogger participation (e.g. the way David Cohen is participating on this Techquila response thread) to measure the degrees. Once the social network graph is established, it can be mined for other kinds of data, such as identifying bloggers who are viewed as experts on particular topics and so on. An enhancement that I can also think of, for maybe version 2.0, is a simple sentiment analyzer that can be used to score “the love” between bloggers and address the “quality of their relationships” that I mentioned above.
    Regarding problems with identifying web presence instances: I think that’s a problem that can be solved iteratively; it’s not a showstopper.

    I like this idea.

  • Graydon

    I think the visual element would be important when dealing with some individuals to really decipher the “sphere of influence” as Tom says…

    I think something like liveplasma would be good…

    What might be a good launching point is to start at the site to site interaction level (re: )
    and then “learn” more of the actual person to person items and incorporate and expand.
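
For anyone who wants to play with Tom C.'s suggestion in the comments above, here is a minimal Python sketch of a degree count that looks at links, mentions and comment-thread participation. Treat it as a sketch under assumptions: the `WEIGHTS` values, the `weighted_degree` name and the sample interactions are all made up for illustration, and this is a weighted variant rather than textbook degree centrality (which counts distinct neighbors).

```python
from collections import defaultdict

# Hypothetical weights per interaction type; tune or replace as needed.
WEIGHTS = {"link": 1.0, "mention": 0.5, "comment": 2.0}


def weighted_degree(interactions, weights=WEIGHTS):
    """Score each node by the summed weight of the interactions it takes part in.

    interactions: iterable of (source, target, kind) tuples.
    Each interaction adds its weight to both endpoints, so the graph is
    treated as undirected. Unknown kinds contribute a weight of 0.
    """
    scores = defaultdict(float)
    for source, target, kind in interactions:
        if source == target:
            continue  # ignore self-references
        weight = weights.get(kind, 0.0)
        scores[source] += weight
        scores[target] += weight
    return dict(scores)


# Made-up sample data in the spirit of this thread.
sample = [
    ("techquila", "techcrunch", "link"),
    ("techquila", "techcrunch", "link"),
    ("david", "techquila", "comment"),
    ("tom", "david", "mention"),
]

for name, score in sorted(weighted_degree(sample).items(), key=lambda kv: -kv[1]):
    print(name, score)
```

Starting at the site-to-site level, as Graydon suggests, would just mean using hostnames as the node names; person-level nodes could be swapped in later once identities can be matched with confidence.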