Ask Your Question

Algorithm used to show related questions

asked 2012-07-06 23:27:21 -0600

kintali gravatar image

What algorithms are being used in askbot to show the "related questions" (1) in the right panel and (2) below the title when you are adding a new question ? Are they simply based on common tags and common strings in the body of the questions ? For (2) Does it keep track of which questions are being "clicked" by users ?

edit retag flag offensive close merge delete

1 Answer

Sort by ยป oldest newest most voted

answered 2012-07-08 20:34:01 -0600

Evgeny gravatar image

It's in askbot.models.question.Thread.get_similar_threads.

First up to 100 questions with matching tags are selected, then similarity is calculated as number of overlapping tags, then 10 most similar threads are shown.

Not a rigorous algorithm at all, maybe you could suggest something better?

The algorithm should be either fast enough to generate the list in real time or we'd need to denormalize the list and recalculate periodically. Now it is not too slow and the result is stored in the cache so we won't need to do that computation every time.

edit flag offensive delete link more


I can help you implement a better algorithm once you pick a search backend.

Joseph gravatar imageJoseph ( 2012-07-13 15:01:08 -0600 )edit

I can see that Xapian readily provides functionality for finding a set of documents similar to a given one; I guess Lucene would have something equivalent. See: [], []. We would just need to fiddle with the factors specific to a QA forum, (namely the relative weights for title/tag/question/answer terms) to try to optimize the relevance.

Basel Shishani gravatar imageBasel Shishani ( 2012-08-24 03:53:15 -0600 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools


Asked: 2012-07-06 23:27:21 -0600

Seen: 496 times

Last updated: Jul 08 '12