TZstatsADS/Spr2017-proj4-team2

Project 4: Who Is Who -- Entity Resolution

Term: Spring 2017

  • Team 2
  • Project title: Who is Who -- Author Disambiguation
  • Team members
    • Jiahui Tan
    • Ruxue Peng
    • Jiahao Zhang
    • Xuanzi Xu
    • Tongyue Liu
  • Project summary: In this project, we implement two algorithms for author disambiguation: (1) the Naive Bayes method proposed in "Two Supervised Learning Approaches for Name Disambiguation in Author Citations" (Han, 2004) and (2) the agglomerative clustering method proposed in "Author Disambiguation Using Error Driven Machine Learning with a Ranking Loss Function" (Culotta, 2007). We then evaluate both methods on prediction accuracy, algorithm sensitivity, and ease of implementation. Because of limited computational capacity, we made some modifications to the original algorithm suggested in paper 5 (Culotta, 2007). Suggestions for further improvement are also provided.
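
As a rough illustration of the Naive Bayes idea mentioned in the summary, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing and assigns a citation to the most probable author. It is not the code in this repository. The function names (`train_naive_bayes`, `predict_author`), the feature tokens, and the toy data are all hypothetical and chosen only for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(citations):
    """Count priors and token frequencies.

    citations: list of (author_id, list_of_feature_tokens) pairs.
    The token format (e.g. "coauthor:lee") is an assumption for this sketch.
    """
    prior_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for author_id, tokens in citations:
        prior_counts[author_id] += 1
        token_counts[author_id].update(tokens)
        vocabulary.update(tokens)
    return prior_counts, token_counts, vocabulary


def predict_author(model, tokens):
    """Return the author_id with the highest log posterior for the tokens."""
    prior_counts, token_counts, vocabulary = model
    total_citations = sum(prior_counts.values())
    vocab_size = len(vocabulary)
    best_author, best_score = None, -math.inf
    for author_id, n_citations in prior_counts.items():
        # Log prior: share of training citations written by this author.
        score = math.log(n_citations / total_citations)
        n_tokens = sum(token_counts[author_id].values())
        for token in tokens:
            # Add-one (Laplace) smoothing keeps unseen tokens from zeroing the score.
            count = token_counts[author_id][token]
            score += math.log((count + 1) / (n_tokens + vocab_size))
        if score > best_score:
            best_author, best_score = author_id, score
    return best_author


# Toy, made-up training data: two distinct authors who share the same name.
training = [
    ("author_A", ["coauthor:smith", "venue:icml", "word:kernel"]),
    ("author_A", ["coauthor:smith", "venue:nips", "word:learning"]),
    ("author_B", ["coauthor:lee", "venue:sigmod", "word:database"]),
    ("author_B", ["coauthor:lee", "venue:vldb", "word:query"]),
]
model = train_naive_bayes(training)
prediction = predict_author(model, ["coauthor:lee", "word:query"])
print(prediction)
```

On this toy data, a citation that shares a coauthor and a title word with `author_B` is assigned to `author_B`. Real citation records would need name normalization and a held-out test set before any accuracy figure is meaningful.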

Contribution statement: (default)

Paper 2 (Naive Bayes) was implemented by Xuanzi Xu and Tongyue Liu.
Paper 5 (Error Driven) was implemented by Jiahui Tan, Ruxue Peng, and Jiahao Zhang.

All team members contributed equally in all stages of this project. All team members approve the work presented in this GitHub repository, including this contribution statement.

Following suggestions by Rich FitzJohn (@richfitz), this folder is organized as follows.

proj/
├── lib/
├── data/
├── doc/
├── figs/
└── output/

To reproduce the results, please first read the README file in the doc subfolder.

About

Spr2017-proj4-team2 created by GitHub Classroom
