Mahout recommendations on two events for similar items
I am trying to solve a problem with Mahout. I have users and courses; a user can view a course or can take a course. If a user is viewing a course, I have to recommend that they take it. The data has only userid and itemid, with no preference values associated with them. For example:
1 2
1 7
2 4
2 8
3 5
4 6
Here the first column is the userid and the second column is the course id. The twist is that the second column can hold both the "viewed" and the "taken" event of a particular course. Suppose course A viewed has id 2 and the same course A taken has id 7 for user 1. If some user other than user 1 comes along and views course A, I have to predict course A taken. The problem is that if users are viewing a course but not taking it, user-based recommendation in Mahout fails, because from a business perspective I have to tell them that the course they are viewing should be taken. Do I need to factorize the dataset here, or which algorithm is best suited to this kind of problem?
One problem is that viewing may not predict (and probably won't predict well) that a user wants to take a course. You should look at the new cross-cooccurrence recommender stuff in Mahout v1. It is part of a complete revamp of Mahout on Spark, using the new Scala DSL and a built-in optimizer for linear algebra. The command line job you are looking for is spark-itemsimilarity, and it can ingest your user and item ids directly, without translating them into non-negative integer ids.
The algorithm takes the action you know you want to recommend (user takes a course); these are the strongest "indicators" that can be used in a recommender. It then finds correlated views, that is, views that led to the user taking the course. This is done by the spark-itemsimilarity job, which can take two actions at a time, finding correlations, filtering out noise, and producing two "indicators". The output of the job is two sparse matrices. Each row is an item from the "user takes course" action dataset, and the values are an ordered list of similar item ids. The first output holds items that are similar because other people took both courses; the second holds items that are similar because other people viewed one and took the other.
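To make the two indicators concrete, here is a minimal sketch in plain Python. It is not Mahout's implementation: it ranks by raw co-occurrence counts only and skips the noise filtering that the real job performs. The function name `indicator` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def indicator(primary, secondary, skip_self=False):
    """For each primary-action item, list secondary-action items ordered by
    how many users did both. Inputs map user -> set of item ids.
    Raw counts only; no noise filtering (an assumption of this sketch)."""
    counts = defaultdict(Counter)
    for user, p_items in primary.items():
        for p in p_items:
            for s in secondary.get(user, ()):
                if skip_self and p == s:
                    continue  # an item is trivially "similar" to itself
                counts[p][s] += 1
    # Highest count first; ties broken by item id for a stable order.
    return {
        p: [s for s, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))]
        for p, c in counts.items()
    }

# Hypothetical data: which courses each user took and viewed.
took = {"u1": {"A", "B"}, "u2": {"A", "B"}, "u3": {"A", "C"}}
viewed = {"u1": {"A", "D"}, "u2": {"D"}, "u3": {"C"}}

similar = indicator(took, took, skip_self=True)  # took -> took (cooccurrence)
cross = indicator(took, viewed)                  # took -> viewed (cross-cooccurrence)
```

In this toy data, viewing D shows up most often alongside taking A, so D leads A's row in the cross indicator.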
The input uses your application-specific ids. You can leave the data mixed if you include a filter term that identifies the action. It looks like:
user-id-1,item-id1,user-took-class
user-id-1,item-id2,user-viewed-class-page
user-id-1,item-id5,user-viewed-class-page
...
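As an illustration of what the filter term buys you, the following sketch splits mixed lines like the ones above into one dataset per action. The function name `split_by_action` is hypothetical, and the column order (user, item, action) is taken from the sample lines rather than from any fixed Mahout format.

```python
from collections import defaultdict

def split_by_action(lines,
                    primary="user-took-class",
                    secondary="user-viewed-class-page"):
    """Group (user, item) pairs by the action named in the third column.
    Returns two dicts mapping user id -> set of item ids."""
    actions = {primary: defaultdict(set), secondary: defaultdict(set)}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, item, action = line.split(",")
        if action in actions:  # ignore actions we are not modelling
            actions[action][user].add(item)
    return actions[primary], actions[secondary]

sample = [
    "user-id-1,item-id1,user-took-class",
    "user-id-1,item-id2,user-viewed-class-page",
    "user-id-1,item-id5,user-viewed-class-page",
]
took, viewed = split_by_action(sample)
```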
The output is delimited text (think CSV, though the format is configurable), and with item-id tokens it looks like this by default:
item-id-1,item-id-100 item-id-200 item-id-250 ...
This is an item id, a comma, and an ordered list of similar items separated by spaces. Index this with a search engine and use the current user's history of action 1 as the query against the primary indicator, and the user's history of action 2 as the query against the secondary cross-cooccurrence indicator. These can be indexed as two fields of the same doc, so there is one query against two fields. This gives you a server as scalable as Solr or Elasticsearch. You create the data models with Mahout, then index and query them with a search engine.
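The indexing and querying step can be sketched without a search engine. The code below parses indicator lines in the format shown above, stores the two indicators as two fields of one doc per item, and scores docs by simple term overlap with the user's two histories. The overlap score is an assumption standing in for whatever relevance ranking Solr or Elasticsearch would apply, and the field names `took` and `viewed`, the function names, and the sample lines are all hypothetical.

```python
def parse_indicator(lines):
    """Parse 'item-id,sim1 sim2 ...' lines into {item: [similar items]}."""
    out = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item, _, rest = line.partition(",")
        out[item] = rest.split()
    return out

def build_docs(primary_lines, cross_lines):
    """One doc per item, with one field per indicator."""
    primary = parse_indicator(primary_lines)
    cross = parse_indicator(cross_lines)
    return {
        item: {"took": set(primary.get(item, [])),
               "viewed": set(cross.get(item, []))}
        for item in set(primary) | set(cross)
    }

def recommend(docs, took_history, viewed_history, k=3):
    """Single query over both fields: count matching terms per doc."""
    took_q, viewed_q = set(took_history), set(viewed_history)
    scores = {}
    for item, fields in docs.items():
        if item in took_q:
            continue  # do not recommend a course already taken
        score = len(fields["took"] & took_q) + len(fields["viewed"] & viewed_q)
        if score:
            scores[item] = score
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

primary_lines = ["course-a,course-b course-c", "course-b,course-a", "course-c,course-a"]
cross_lines = ["course-a,course-d course-a", "course-b,course-d", "course-c,course-c"]
docs = build_docs(primary_lines, cross_lines)
recs = recommend(docs, took_history=["course-b"],
                 viewed_history=["course-a", "course-d"])
```

Here a user who took course-b and viewed course-a and course-d matches course-a in both fields, so course-a is the recommendation, which is the "viewed, so recommend taking it" behaviour the question asks for.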
Mahout docs: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
A presentation on the theory and other things you can do with these techniques: http://www.slideshare.net/pferrel/unified-recommender-39986309
Using this technique you can take virtually the entire user clickstream, recorded as separate actions, and use it to make better recommendations. The actions don't even have to be on the same items. You can use the user's search term history, for instance, as a cross-cooccurrence indicator. In that case the output would contain the search terms that lead users to take a course, and the query would be the current user's search term history.