What’s in a GitHub wiki?

GitHub is the world’s largest online platform for developers to create and share their projects.

Even though every GitHub repository comes equipped with a wiki section for storing documentation, little work has been done to analyse the content of GitHub wiki pages. Furthermore, wiki pages for software projects often contain outdated or incomplete documentation.

The goal of this project is to gain insight into the types of content software developers typically store in GitHub wiki pages, and to design a machine learning classifier that automatically sorts the pages into categories (e.g. project information, setup instructions, issues and bugs). This project may ultimately lead to future work such as a recommender system that notifies software developers when documentation is out of date and recommends what should be updated.
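As a rough illustration of what classifying wiki pages into categories could look like, the sketch below trains a small multinomial Naive Bayes text classifier using only the Python standard library. It is not the project's actual method: the choice of Naive Bayes, the sample page texts, and the names TRAINING_PAGES, tokenize, train and classify are all assumptions made for this example. Only the three category names come from the description above.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled pages. The category names come from the project
# description; the page texts are invented purely for illustration.
TRAINING_PAGES = [
    ("This project provides a library for parsing dates. Overview and goals.",
     "project information"),
    ("About the project: features, roadmap and license overview.",
     "project information"),
    ("Install the package with pip, then run the setup script to configure.",
     "setup instructions"),
    ("To install, clone the repository and run the build command.",
     "setup instructions"),
    ("Known bug: the parser crashes on empty input. See the open issue.",
     "issues and bugs"),
    ("List of known issues, crashes and workarounds for reported bugs.",
     "issues and bugs"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(pages):
    """Count words per category, pages per category, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in pages:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the category with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_pages = sum(label_counts.values())
    tokens = [t for t in tokenize(text) if t in vocabulary]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: fraction of training pages with this label.
        score = math.log(label_counts[label] / total_pages)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_PAGES)
print(classify("How to install and run the setup", model))
```

A real classifier for this task would need a much larger labelled set of wiki pages and a proper evaluation; the sketch only shows the general shape of the problem, where page text goes in and a category label comes out.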

Transforming technologies


Computer Science

Wen Siang Tan

