Markup platform

MARKUS is a markup platform developed by Brent Hou Ieong Ho. It was first designed to automate the markup of different kinds of named entities (personal names, place names, temporal references, and official titles). It has since developed into a multi-faceted tool that allows users to access a range of online reference tools while reading texts in classical Chinese, and to tag and extract any kind of information of interest to them. In addition to the names already present in the China Biographical Database and China Historical GIS, users can tag words or expressions by uploading their own lists or by using regular expressions. MARKUS includes a small selection of ready-made regular expressions, useful for example for the analysis of quotations and citations. Users can also design their own regular expressions and tag the results. All markup can be edited while consulting the reference sources integrated in the platform. The final results can be exported for further analysis in a planned visualization platform and in other tools.
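The regular-expression tagging described above can be illustrated with a minimal sketch. This is not MARKUS's own code or output format: the function name `tag_matches`, the `<span class="...">` markup, and the book-title pattern are all assumptions chosen for illustration.

```python
import re


def tag_matches(text, pattern, label):
    """Wrap every match of `pattern` in a hypothetical <span> tag.

    The tag format is an assumption for illustration only; it is not
    the markup that MARKUS itself produces.
    """
    regex = re.compile(pattern)
    return regex.sub(
        lambda m: f'<span class="{label}">{m.group(0)}</span>', text
    )


# Illustrative user-defined pattern: a title enclosed in Chinese
# book-title marks, as found in punctuated editions.
TITLE_PATTERN = r"《[^》]+》"

sample = "見《史記》及《漢書》"
tagged = tag_matches(sample, TITLE_PATTERN, "title")
print(tagged)
```

A user-supplied list of terms could be handled the same way by joining the escaped terms into a single alternation pattern with `re.escape` and `"|".join(...)`.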

MARKUS is currently in beta. We continue to improve the accuracy of the automated markup and are adding further modules. The further development of MARKUS will be undertaken jointly with partners at Harvard University, University of Birmingham, National Taiwan University, and Academia Sinica. This development is at the core of Automating Data Extraction from Chinese Texts, a project made possible through the Digging into Data Challenge.


Visualization platform

We have started work on a platform that allows users to analyze texts that have already been marked up. This platform contains table, pie chart, and bar graph views for quantitative analysis, and network and map visualizations for relationship and geographic data. This system is also integrated with the China Biographical Database and China Historical GIS, allowing users to include data from these large databases in their analysis.

CBDB Mac Dictionary

The CBDB Mac Dictionary is an alternative way to search CBDB (China Biographical Database). It can be used on Apple computers without internet access. Users can search CBDB either through Spotlight, which is integrated into Mac OS X, or by selecting any name or partial name in any programme (the Look up in dictionary function).



CBDB Name/Title Lookup

CBDB Name/Title Lookup is a free online service for searching personal names in CBDB (China Biographical Database) when only part of the name and an official title, or part of one, are known. The system searches the biographical data of all 328,000 entries in CBDB. A faceted interface helps narrow down candidates quickly, and detailed biographical data can be checked directly.
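The kind of partial-match search described above can be sketched as a simple substring filter. This is only a sketch under stated assumptions: the record layout, the function name `lookup`, and the three toy records are hypothetical and are not drawn from CBDB or from the service's actual implementation.

```python
# Toy records for illustration only; they are NOT taken from CBDB
# and the field names are hypothetical.
RECORDS = [
    {"name": "王安石", "titles": ["參知政事"]},
    {"name": "蘇軾", "titles": ["翰林學士"]},
    {"name": "蘇轍", "titles": ["門下侍郎"]},
]


def lookup(records, name_part, title_part):
    """Return records whose name contains `name_part` and which hold
    at least one title containing `title_part`."""
    return [
        rec
        for rec in records
        if name_part in rec["name"]
        and any(title_part in title for title in rec["titles"])
    ]


# Only the surname and a fragment of the title are known.
candidates = lookup(RECORDS, "蘇", "學士")
print([rec["name"] for rec in candidates])
```

Leaving one argument empty (`""`) matches every record on that field, which loosely mirrors narrowing down candidates one facet at a time.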


The Sieve Online

The Sieve Online is an online service of the Sieve. The Sieve allows researchers to assess audience perception of classical or vernacular texts of various kinds. The Sieve uses a set of elementary textbooks as a benchmark to determine the vocabulary recognisable to an audience with a limited level of literacy. It can also be used to compare two texts. To use the tool, upload two texts or two sets of texts. The first text or set of texts will be used as a reference point to which the second text will be compared. The Sieve analyses both texts and differentiates shared and unique characters. It is also possible to look up any character using the dictionary tool (click on a character and the definition will pop up).
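The comparison step described above, separating shared from unique characters, can be sketched with set operations. This is a minimal illustration, not the Sieve's actual code: the function name `compare_characters`, the returned keys, and the `coverage` ratio are assumptions introduced here.

```python
def compare_characters(reference, target):
    """Compare the distinct characters of `target` against `reference`.

    Punctuation, digits, and whitespace are skipped via str.isalpha().
    The `coverage` figure (share of the target's distinct characters
    that also occur in the reference) is an illustrative measure and
    not necessarily what the Sieve reports.
    """
    ref_chars = {ch for ch in reference if ch.isalpha()}
    tgt_chars = {ch for ch in target if ch.isalpha()}
    shared = tgt_chars & ref_chars
    unique = tgt_chars - ref_chars
    coverage = len(shared) / len(tgt_chars) if tgt_chars else 0.0
    return {"shared": shared, "unique_to_target": unique, "coverage": coverage}


# The first text serves as the reference point; the second is compared to it.
reference_text = "人之初,性本善。"
target_text = "人性善惡"

result = compare_characters(reference_text, target_text)
print(sorted(result["shared"]), sorted(result["unique_to_target"]))
```

To compare sets of texts, the files in each set could be concatenated before being passed in.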


Recent blog posts

International Medieval Congress 2015 by mchu, July 30, 2015, 3:11 p.m.

Team members Hilde De Weerdt, Chu Mingkin and Julius Morche contributed to the panel “Historical Knowledge Networks in Global Perspective”.

MARKUS update and new tools by hweerdt, March 12, 2015, 6:38 a.m.

The MARKUS tagging and reading platform has gone through a major update. New features are …

Away day for the "State and society network" at LIAS by mchu, Dec. 5, 2014, 12:40 p.m.

Team members Hilde De Weerdt, Julius Morche and Chu Ming-kin participated in the Away Day of the “state and society …
