Sample document corpus
From MathWeb
We plan to build a corpus of semantically annotated sample documents about various mathematical topics, which can be used to evaluate knowledge management tools.
Formality
Depending on the type of tool, these documents can range from informal ones (e.g. for natural language processing) to formal ones (e.g. for evaluating automated theorem provers or development graph managers).
Validity/Soundness
Depending on the type of case study, the contents of these documents need not be valid or sound. For document management, a theorem may count as proved simply because a "trivial" proof points to it, provided the software only checks the structural constraint that a proof exists. For testing automated theorem provers, a proof may be wrong.
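The structural constraint mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory model: the names `Theorem`, `Proof`, `proves`, and `unproved_theorems` are invented for illustration and do not come from any MathWeb tool or annotation format. The check only asks whether some proof points to each theorem and never inspects the proof body, so an unsound document can still pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theorem:
    id: str
    statement: str


@dataclass(frozen=True)
class Proof:
    id: str
    proves: str            # id of the theorem this proof points to
    body: str = "trivial"  # placeholder content; the check never reads it


def unproved_theorems(theorems, proofs):
    """Return the ids of theorems that no proof points to."""
    proved = {p.proves for p in proofs}
    return [t.id for t in theorems if t.id not in proved]


# Hypothetical sample document: the second theorem is false, but it
# still satisfies the structural constraint because a proof points to it.
theorems = [Theorem("thm1", "1 + 1 = 2"), Theorem("thm2", "0 = 1")]
proofs = [Proof("pf1", proves="thm1"), Proof("pf2", proves="thm2")]

missing = unproved_theorems(theorems, proofs)
```

Here `missing` is empty even though `thm2` is false, which is acceptable for a document management case study but not for one that tests a theorem prover.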

