Abstract
Entity resolution (ER) aims to identify whether two entities in an ER task refer to the same real-world thing. Crowd ER uses humans, in addition to machine algorithms, to obtain the truths of ER tasks. However, inaccurate or erroneous results are likely to be generated when humans give unreliable judgments. Previous studies have found that correctly estimating human accuracy or expertise in crowd ER is crucial to truth inference. However, a large number of them assume that humans have consistent expertise over all the tasks, and ignore the fact that humans may have varied expertise on different topics (e.g., music versus sport). In this paper, we deal with crowd ER in the Semantic Web area. We identify multiple topics of ER tasks and model human expertise on different topics. Furthermore, we leverage similar task clustering to enhance the topic modeling and expertise estimation. We propose a probabilistic graphical model that computes ER task similarity, estimates human expertise, and infers the task truths in a unified framework. Our evaluation results on real-world and synthetic datasets show that, compared with several state-of-the-art approaches, our proposed model achieves higher accuracy on the task truth inference and is more consistent with the human real expertise.
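Publication date
June 16, 2018 (date first posted online on the 中国期刊网 (China Journal Net) platform; this does not represent the date the paper was published)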