Positive Feedback in Social Search

One suggestion from today's social search session at #swfoo was to send queries (e.g., "vacations in Venice") to both search engines and your friends.  The problem is that many of your friends know nothing about vacations in Venice, so sending them the query both spams them and lowers the relevance of the results; noise grows linearly with the overall size of the system.  This is why the good results that early adopters with 20K followers get from asking "what's the best pizza in Sebastopol" don't scale.

But there's a nice solution to this, I think.  When you do get somewhat relevant results from friends, you click through on their answers.  Each click tells the system that the friend's answer was relevant in context, allowing it to learn which friends are competent in which fields.  Combine these click signals across everyone who asks questions of the same friends to cancel out individual bias, and you're left with a vector of weights for each person in the network, one weight per field of expertise.  Use this to do a few things:
  • Give explicit reputation to people who answer, to accompany the implicit social debt incurred
  • Rank their answers higher in search results -- in many cases above traditional search engines, if the engines prove less competent on that topic
  • Don't spam incompetent people with questions they can't answer
  • Potentially, reach beyond your immediate social network to find the real experts on the subject and send your question to them.
This is much more scalable than trying to categorize your friends explicitly as experts in various areas.  You'll still categorize them implicitly, by first clicking on results from friends you already know to be experts, which helps bootstrap the system.  But you never need to know you're doing this; the system learns automatically.
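
The feedback loop described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not a worked-out design: the `CompetenceModel` class, its method names, the smoothing prior, and the 0.4 threshold are all hypothetical, and the weight is simply a smoothed click-through rate per (friend, topic) pair, pooled over every asker's clicks.

```python
from collections import defaultdict


class CompetenceModel:
    """Hypothetical sketch: one weight per (friend, topic), learned from clicks."""

    def __init__(self, prior_clicks=1.0, prior_shown=2.0):
        # Assumed smoothing prior: an unseen friend starts at 1/2 = 0.5.
        self.prior_clicks = prior_clicks
        self.prior_shown = prior_shown
        # (friend, topic) -> [clicks, times an answer was shown]
        self.stats = defaultdict(lambda: [0.0, 0.0])

    def record(self, friend, topic, clicked):
        """Log that `friend`'s answer on `topic` was shown to some asker."""
        entry = self.stats[(friend, topic)]
        entry[1] += 1
        if clicked:
            entry[0] += 1

    def weight(self, friend, topic):
        """Smoothed click-through rate, pooled across all askers."""
        clicks, shown = self.stats.get((friend, topic), (0.0, 0.0))
        return (clicks + self.prior_clicks) / (shown + self.prior_shown)

    def recipients(self, friends, topic, threshold=0.4):
        """Only ask friends whose weight clears an (assumed) threshold."""
        return [f for f in friends if self.weight(f, topic) >= threshold]

    def rank(self, answers, topic):
        """Order (friend, text) answers by the answering friend's weight."""
        return sorted(answers, key=lambda a: self.weight(a[0], topic), reverse=True)


model = CompetenceModel()
for _ in range(3):
    model.record("alice", "venice", clicked=True)   # her answers get clicked
    model.record("bob", "venice", clicked=False)    # his answers get ignored

print(model.recipients(["alice", "bob", "carol"], "venice"))
print(model.rank([("bob", "Try Rome instead"), ("alice", "Stay in Cannaregio")], "venice"))
```

In this sketch a friend the system has never seen on a topic ("carol" here) sits at the neutral prior and still gets asked, which is one simple way to let new expertise surface.  A real system would presumably also need to infer the topic of each query and weight askers differently; both are left out.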
