SWiM/discussions

Brainstorming: how to use constructive feedback from discussions about structured domain knowledge (here: from mathematics, written in OMDoc) in order to improve the latter.

Possible Targets

 * WikiSym conference, deadline Sat May 3
 * SCooP workshop, deadline May 30 (abstract May 16)
 * LWA/FGWM, deadline June 6

Outline

 * general abstract: structured domain knowledge (e.g. in a wiki), discussions about that, making use of constructive discussion posts in order to trigger improvements to the domain knowledge.
 * (write a bit more on the general problem? What do you think? Note that I don't have too much background knowledge about discussion/argumentation ontologies, I could not do that. --Christoph 17:32, 28 April 2008 (CEST))
 * from a wiki point of view (targeting WikiSym), our main thrust is:
 * wikis host domain knowledge and discussions about that knowledge
 * many semantic wikis improve the management of domain knowledge
 * most semantic wikis neglect discussions, hardly any semantics for that
 * but there is SIOC and there are argumentation ontologies for making discussions semantic
 * now we're going to improve a (semantic) wiki by utilising SIOC
 * high-level intro to some use cases from mathematics (see below)
 * technical foundations: semantic wiki SWiM
 * wiki pages are documents semantically marked up in OMDoc, RDF extracted from this markup (cf. SALT, similar approach)
 * here: a wiki page and the knowledge item contained on that page are considered the same (has some disadvantages, but the advantage is that it makes modeling easier)
 * one discussion page per wiki page (a SIOC forum)
 * discussion posts can additionally be typed with a concrete type of issue, or of a request for improving the associated knowledge item
 * just having question/comment/criticism (see image) is insufficient, as for a certain type of knowledge item (e.g. the definition of a mathematical symbol) there can be more than one type of objection and more than one way of improving the knowledge item
 * good semantic web infrastructure available: Jena, SPARQL, optionally Pellet (OWL-DL)
 * discuss prototypical implementation
 * evaluation: needs to be done, for now probably give some hints
 * related work (see below)
 * further directions, future work, outlook

Participants
Enter yourself here:


 * Christoph, providing domain background and use cases (mathematical knowledge management) and technical platform SWiM, semantic wiki expert
 * …, SIOC expert
 * …, expert for argumentation ontologies
 * Tuukka, some SIOC and wiki experience

Advisors so far:
 * John, senior advisor w.r.t. SIOC
 * Uldis, advisor w.r.t. SIOC
 * Tudor, advisor w.r.t. argumentation

Modeling the discussion

 * SIOC is suitable for
 * representing posts and pointing to the topics the posts reply to (can be original knowledge items or other posts)
 * representing users who made posts
 * actually also for representing wiki pages and their versions
 * but not for the knowledge contained in the wiki pages
 * and not for the semantics of posts w.r.t. speech acts or argumentation
 * Try to get constructive feedback about a knowledge item
 * not just positive vs. negative, but also "unclear", "incomprehensible", "nice, but …"
 * depending on the type of the knowledge item, there is no single default action that improves things when it gets negative votes; there are multiple dimensions of criticism and multiple ways to resolve it
 * TDL (Thread Description Language): represent agreement/disagreement of posts with other posts (here: also knowledge items)
 * Vote Links Microformat: for/against/abstain
 * Speech acts
 * classify: is it a question, answer, comment, and what is the intention of the post
 * Argumentation
 * Tudor: is orthogonal to speech acts
 * Argue why the knowledge item is questioned and why it should be reorganized
 * SALT full
 * IBIS
 * ArgDF

Approach
There are linked items of mathematical knowledge (here: written in OMDoc), such as


 * symbol declarations
 * definitions
 * axioms
 * assertions: theorems, lemmas, corollaries, hypotheses, …
 * proofs
 * examples
 * notation definitions for symbols

Every one of them is considered an RDF resource (here: represented on its own semantic wiki page).

Now look at the discussion pages (in fact, one SIOC forum per wiki page):
 * 1) users report issues with the knowledge item (e.g. “I don't understand this”)
 * 2) For any such issue, a solution can be proposed (e.g. “provide an example”)
 * 3) For any such proposed solution, users can state agreement or disagreement
 * 4) Votes are counted, the most popular solutions are determined, and actions are inferred from them (e.g. asking other authors to restructure or explain something)
 * 5) issue is marked as solved
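
The five steps above can be sketched as a small data model. This is only a sketch under assumptions: the class names `Issue` and `Proposal`, the "agreements minus disagreements" score, and the rule that a proposal needs net support are all hypothetical and not part of SWiM or SIOC.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A proposed solution to an issue (step 2) with user votes (step 3)."""
    text: str
    # user name -> True (agrees) or False (disagrees)
    votes: dict = field(default_factory=dict)

    def score(self) -> int:
        # Agreements minus disagreements: a deliberately simple popularity measure.
        return sum(1 if agrees else -1 for agrees in self.votes.values())


@dataclass
class Issue:
    """An issue reported about one knowledge item (step 1)."""
    item: str
    text: str
    proposals: list = field(default_factory=list)
    solved: bool = False  # step 5

    def most_popular(self):
        # Step 4: the proposal with the highest score, if any has net support.
        supported = [p for p in self.proposals if p.score() > 0]
        return max(supported, key=Proposal.score, default=None)


issue = Issue(item="theorem-1", text="I don't understand this")
example = Proposal("provide an example")
rewrite = Proposal("rewrite the statement")
issue.proposals += [example, rewrite]
example.votes.update({"alice": True, "bob": True, "carol": False})
rewrite.votes.update({"alice": False, "bob": True})

winner = issue.most_popular()
print(winner.text)
```

Marking the issue as solved (step 5) is left to whatever check decides that the winning proposal has actually been carried out.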

Pre-Survey
We know what types of mathematical knowledge items exist (from OMDoc) and can rely on that, as OMDoc and similar approaches are well accepted. But we know much less about:


 * what can be wrong with a knowledge item (e.g. incomprehensible)
 * how can it be solved (e.g. by providing an example)

Conduct a survey among several mathematicians (e.g. within the JEM project). For every type of knowledge item, provide a few problems and solutions that we believe to be common and let the volunteers
 * select some of them
 * or suggest additional ones

/questionnaire (feed this into a survey tool like LimeSurvey: http://www.limesurvey.org)

Ontology/Rules
Formalize the results of the survey into rules.

Start:
 * assertion
 * problem: incomprehensible
 * solution (if no example exists): provide example
 * solution (if examples exist): provide an example that's more appropriate for the needs of the reader (but here we'd need more feedback on the existing examples!)
 * example
 * problem: inappropriate for domain of the reader
 * solution: provide a second example
 * notation
 * problem: inappropriate
 * solution: provide alternative notation for the same symbol
 * problem: hard to read
 * solution: provide alternative notation for the same symbol
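
The starting rules above could, for a first prototype, be written down as a plain lookup table before being formalized in an ontology or rule language. The sketch below assumes a hypothetical `RULES` table keyed by (item type, problem) and a hypothetical `propose_solution` helper; only the problem and solution texts come from the list above.

```python
def for_incomprehensible_assertion(existing_examples: int) -> str:
    # The solution depends on whether the assertion already has examples.
    if existing_examples == 0:
        return "provide example"
    return "provide an example that's more appropriate for the needs of the reader"


ALTERNATIVE_NOTATION = "provide alternative notation for the same symbol"

# (type of knowledge item, problem) -> function from example count to solution
RULES = {
    ("assertion", "incomprehensible"): for_incomprehensible_assertion,
    ("example", "inappropriate for domain of the reader"):
        lambda n: "provide a second example",
    ("notation", "inappropriate"): lambda n: ALTERNATIVE_NOTATION,
    ("notation", "hard to read"): lambda n: ALTERNATIVE_NOTATION,
}


def propose_solution(item_type: str, problem: str, existing_examples: int = 0):
    """Return the solution text for a reported problem, or None if no rule applies."""
    rule = RULES.get((item_type, problem))
    return rule(existing_examples) if rule else None


print(propose_solution("assertion", "incomprehensible", existing_examples=0))
print(propose_solution("notation", "hard to read"))
```

Results of the survey would extend the table; combinations without a rule simply yield no proposal.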

Requirements

 * 1) ontology representing types of knowledge items and links between them
 * 2) ontology representing discussions associated to knowledge items (and probably pointing to other knowledge items) -- SIOC
 * 3) note: SIOC doesn't provide types of discussion posts such as "question" or "complaint", neither does it model things like "I (dis)agree because (pointer to knowledge item)" or "I suggest the following action be taken"
 * 4) * need speech acts and/or argumentation (see above)
 * 5) * idea: force people not to post meaningless untyped discussion posts but to explicitly state (dis)agreement: “People are already experimenting with new social machines for online peer review, while other tools such as chat rooms developed quite independently and before the Web. […] By experimenting with these structures, we may find a way to organise new social models that not only scale well, but can be combined to form larger structures. […] I’d always been frustrated that the essential role of a message in an argument was often lost information. […] We created a sub-directory called Discussion [… that] allowed people to post questions on a given subject, read and respond. A person couldn’t just ‘reply’. He had to say whether he was agreeing, disagreeing or asking for clarification of a point. The idea was that the state of the discussion would be visible to everyone involved.” – Tim Berners-Lee, “Weaving the Web”
 * 6) * one step further: force people to give more detailed feedback, i.e. to state as exactly as possible what they think is wrong and needs to be improved
 * 7) some way to infer the additional properties which the knowledge items obtain from the discussion posts
 * 8) * e.g. "any mathematical concept that has at least three questions or comments on its discussion page and that does not have at least one/two examples is difficult" (next step: customise this number)
 * 9) ** including a condition when the problem is considered to be solved (here: an example is available) is necessary for termination!
 * 10) ** sure, this is superficial, but better than nothing!
 * 11) * rules
 * 12) * or DL axioms: make knowledge item an instance of a class $$\text{MathConcept}\sqcap\exists\text{ikewiki:hasDiscussion}.(\text{sioc:Forum}\sqcap\ge 3\,\text{sioc:container-of}.(\text{sioc:Post}\sqcap\dots))\sqsubseteq\text{HardStuff}$$ (note: this is a concrete example how it would look in SWiM. In the current knowledge model of SWiM, the resource described on a wiki page and the wiki page are considered the same thing.)
 * 13) ** to do: add "not having an example" here!
 * 14) ** note, here we just count the number of posts. The $\dots$ need to be replaced by a particular type of post, e.g. “I don't understand this, it should be explained”
 * 15) * or probably for a prototype just do some hard-coded queries (can we express counting in SPARQL subqueries? perhaps with the extensions provided by Jena?)
 * 16) some actions that can then be taken, using these properties, e.g.:
 * 17) * a dynamic, user-centered approach: identify main authors of affected knowledge items and somehow "ask" them to take action
 * 18) * a dynamic, content-centered approach: auto-create a new knowledge item (e.g. an example for a proof, or a new notation for a symbol) that contains as text a justification why it was created (because the corresponding theorem was too difficult) and asks the reader to actually provide the example. (This is "the wiki way": assume every reader to eventually become a writer!)
 * 19) * a static, content-centered approach: just tag the knowledge item and based on that tag …
 * 20) ** on the pages hosting the knowledge items affected, place a "red link" leading to the (template-based) creation of the knowledge item that would solve the problem
 * 21) ** on some entry page or project-management page, auto-create a to-do list with an inline query over all tagged knowledge items
 * 22) * all approaches can be combined!
 * 23) * if the "solution knowledge item" is auto-generated, consider adding a justification there to inform the readers/editors: "This was created because of that discourse."
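
The counting rule from the requirements ("at least three questions or comments and no example") and the static to-do list can be sketched without any reasoner. Everything named here (`ConceptStats`, `is_difficult`, `todo_list`, and the default thresholds) is a hypothetical illustration; a real prototype would obtain the counts from the RDF data instead of hard-coding them.

```python
from dataclasses import dataclass


@dataclass
class ConceptStats:
    name: str
    complaints: int  # questions or comments on the discussion page
    examples: int    # examples available for the concept


def is_difficult(c: ConceptStats, min_complaints: int = 3, min_examples: int = 1) -> bool:
    # The second condition is the termination condition: once enough
    # examples exist, the concept no longer counts as difficult.
    return c.complaints >= min_complaints and c.examples < min_examples


def todo_list(concepts, **thresholds):
    """Static, content-centered approach: one to-do entry per difficult concept."""
    return [
        f"provide an example for {c.name}"
        for c in concepts
        if is_difficult(c, **thresholds)
    ]


concepts = [
    ConceptStats("group", complaints=4, examples=0),
    ConceptStats("monoid", complaints=5, examples=1),
    ConceptStats("ring", complaints=1, examples=0),
]
print(todo_list(concepts))
print(todo_list(concepts, min_examples=2))
```

The keyword thresholds correspond to the "customise this number" step: raising `min_examples` to 2 also flags concepts that have only one example.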

Implementation
Some feedback from Tuukka and Uldis:
 * discussion UI: show text "Issue", "Idea", ... in the head of the table of a post
 * make buttons more descriptive
 * provide text template for the post, as "steps for reproduction" in Bugzilla
 * assist in solving problems by executing ideas
 * assist in closing tickets: If system thinks that issue is solved, offer posting a decision

Proof/theorem explanation

 * 1) background: In OMDoc, one can write proofs for assertions (more specific: theorems, lemmas, corollaries, …), and one can explain assertions by examples.
 * 2) starting point: a theorem and its proof
 * 3) the theorem or the proof is too formal or too hard to understand
 * 4) readers complain about this or ask questions
 * 5) * on the discussion page of the proof
 * 6) * or they complain on the discussion page of the theorem that they don't understand the theorem
 * 7) * both cases are likely; both can be solved by providing an appropriate example for (the application of) the theorem

Symbol notations
Assume a symbol has been defined and its semantics is clear, but the notation of the symbol is disputed.


 * this might be discussed on the page of the symbol
 * or on the page where the notation is defined.

Technical limitation: currently SWiM only uses one notation per symbol. You can provide more than one, but the last one will always win and will be used to render formulæ.
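
The "last one wins" behaviour described above can be stated in one line. The function name `active_notation` and the notation strings are hypothetical; this is not SWiM's actual rendering code, only an illustration of the limitation.

```python
def active_notation(notations):
    """Return the notation used for rendering: the last one provided, or None."""
    return notations[-1] if notations else None


# Two notations provided for the same (hypothetical) multiplication symbol.
notations = ["a \\cdot b", "a \\times b"]
print(active_notation(notations))
```

Under this limitation, a disputed notation can only be replaced, not offered as an alternative alongside the existing one.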

Theory refactoring
Note that refactoring support is highly desirable in our domain, but so far SWiM offers no assistance for it. Of course, if a knowledge item is eligible for refactoring, users will discuss that. Here it is harder to figure out from the questions which refactoring task the users are actually discussing, and then to trigger some support for it.

(ask Immanuel!)

Splitting
Scenario:
 * 1) new theory developed, e.g. new mathematical model for sth. Includes all of the symbols and axioms needed
 * 2) * note: in the OMDoc/SWiM setting both the theory and each of its symbols and axioms would be separate knowledge items
 * 3) * the symbols, definitions, axioms point to their "home theory"
 * 4) later it's discovered that this theory consists of (w.l.o.g.) two disjoint subsets
 * 5) users suggest where to make the cut
 * 6) theory is split
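
One way the system could assist step 5 is to suggest a cut automatically. The sketch below assumes that "disjoint" means that no symbol, definition or axiom in one subset refers to any item in the other, so the subsets are the connected components of a "refers to" graph. The function `suggest_split` and the sample item names are hypothetical.

```python
from collections import defaultdict


def suggest_split(items, uses):
    """Group the items of a theory into connected components.

    items: names of the symbols/axioms of the theory
    uses:  pairs (a, b) meaning that item a refers to item b
    """
    neighbours = defaultdict(set)
    for a, b in uses:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, parts = set(), []
    for start in items:
        if start in seen:
            continue
        # Depth-first search collecting everything reachable from `start`.
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(neighbours[node] - part)
        seen |= part
        parts.append(part)
    return parts


items = ["plus", "zero", "plus-assoc", "vertex", "edge", "edge-symmetric"]
uses = [
    ("plus-assoc", "plus"),
    ("zero", "plus"),
    ("edge", "vertex"),
    ("edge-symmetric", "edge"),
]
for part in suggest_split(items, uses):
    print(sorted(part))
```

If the result has a single component, no clean cut exists under this assumption, and the users' suggestions in the discussion remain the only guide.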

Merging

 * 1) users identify two similar knowledge items and suggest merging them (common case even on Wikipedia)
 * 2) argumentation: which one should be merged into which one
 * 3) do the merge

Factoring out parts
Not exactly the same as splitting. You find a theory inclusion, i.e. that a subtheory of the current theory is, probably after application of a morphism, the same as another theory that already exists. For example, a subtheory of your current theory may actually be a group or some other common algebraic structure. So in the discussion posts you need pointers to an eligible candidate c, and then you need to refactor your own theory to import c.
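
A purely syntactic check for such an inclusion might look as follows. This is a strong simplification under stated assumptions: axioms are modelled as tuples of tokens, the morphism is a plain symbol renaming, and inclusion means that every translated axiom of the candidate c literally occurs in the current theory. Real theory inclusions in OMDoc generally involve proof obligations, which this sketch ignores; the names `translate` and `is_included` are hypothetical.

```python
def translate(axiom, morphism):
    """Rename the symbols of an axiom according to the morphism."""
    return tuple(morphism.get(token, token) for token in axiom)


def is_included(candidate_axioms, theory_axioms, morphism):
    """True if every axiom of the candidate, once translated, occurs in the theory."""
    theory = set(theory_axioms)
    return all(translate(a, morphism) in theory for a in candidate_axioms)


# Candidate c: a monoid with operation "op" and neutral element "e".
monoid = [("associative", "op"), ("neutral", "op", "e")]

# Current theory: contains the same structure under other symbol names.
current = [("associative", "+"), ("neutral", "+", "0"), ("commutative", "+")]

morphism = {"op": "+", "e": "0"}
print(is_included(monoid, current, morphism))
```

A discussion post pointing to candidate c would supply the candidate and, ideally, the renaming; the check then tells whether importing c is worth proposing.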

Visualizing evolution
Use Timeline widget to show how new knowledge items were initiated by discussion posts about older knowledge items.

Taking it to the web

 * Uldis: consider feeding local SIOC data into search engines like SWSE to connect contributions of users registered locally to contributions they made elsewhere
 * Uldis: imagine external blogs commenting on content in "our" wiki, how can we make use of these comments?
 * imagine a Web 2.0 site hosting and interconnecting discussions about knowledge items in lots of other wikis

Community building

 * Uldis: work against the tendency of wikis to dissolve communities, to marginalize the user
 * Christoph: but I'm opportunistic: for me the interaction of the users is only a means to improving the knowledge, the content!
 * Thomas/Uldis: user rating, reputation (karma)

panta rhei

 * http://kwarc.info/projects/panta-rhei/ (Christine Müller: panta rhei)
 * mathematical knowledge (mainly lecture notes, but in principle anything), also using OMDoc
 * knowledge itself is read-only
 * discussions about knowledge items (but knowledge items tend to be larger, e.g. a whole section of a lecture, containing multiple symbol definitions)
 * classified discussion posts (home-made "ontology", technically not based on RDF, but using URIs for discussion posts), but same types of discussion posts (e.g. question, comment, explanation) for every type of knowledge item
 * discussions currently used for statistical purposes, later probably to identify communities, but not yet for triggering actions on the knowledge items (knowledge items are read-only)
 * we are not (yet?) interested in what communities the discussing users belong to, but only in their contributed suggestions
 * some categories of rating. Our focus is not on rating but on issue-tracking
 * potentially a higher number of users, because focus is on reading, not on authoring

Argumentation-based ontology engineering

 * Argumentation-Based Ontology Engineering (just read abstract so far):
 * authoring structured knowledge is almost the same as ontology engineering IMHO -- this is highly relevant! --Christoph 22:49, 28 April 2008 (CEST)
 * “restricting arguments increases … agreement, clarity, and satisfaction”

Discourses in wikis

 * Foucault@Wiki: First Steps Towards a Conceptual Framework for the Analysis of Wiki Discourses (a WikiSym publication)
 * analyses discourses in Wikipedia; how editing comments and postings on the discussion page relate to changes to the article, what aspects of change there are
 * our analysis is much easier, as the dimensions in our space are restricted (small number of possible annotations)
 * our knowledge items are much smaller than Wikipedia articles
 * small knowledge items are generally recommended in semantic wikis, at least in the prevalent model "1 page = 1 resource", as it helps to make more of the structure explicit

Speech acts

 * http://smile.deri.ie/projects/semanta/ (Simon Scerri: semantic e-mail)
 * based on speech act theory
 * Identify the intention of an e-mail via NLP (e-mails correspond to our knowledge items)
 * turn that into semantic metadata (not SIOC, but probably similar)
 * then trigger some actions/workflows (more elaborate schema of workflows)

Linked data

 * http://events.linkeddata.org/ldow2008/papers/01-bojars-passant-weaving-sioc.pdf (Uldis Bojārs, Alexandre Passant, Richard Cyganiak, John Breslin: Weaving SIOC into the web of linked data)
 * main focus: linking from discussions to knowledge items for the purpose of making related topics explicit, i.e. to improve navigation and retrieval (no action taken, no workflow triggered, knowledge graph not changed through discussions)

Other

 * Inquiry Driven Systems : Inquiry Into Inquiry. I cannot address the specific platforms that you are using, but many of the themes mentioned on this page resonate with my ongoing work.  The general framework in which I operate derives from C.S. Peirce's theories of information, inquiry, relations, and signs.  If anyone is interested, I will assemble some links to post here.  This may be more a thing for the future, though.  Jon Awbrey 13:22, 3 May 2008 (CEST)
 * TODO: find something on issue-tracking (semantic bug tracker, trouble ticket system?)

Snippets

 * Aniket Kittur, Bongwon Suh, Bryan A. Pendleton, Ed H. Chi: He says, she says: Conflict and Coordination in Wikipedia: the number of edits of a Wikipedia discussion page is the most important metric for determining whether a page is subject to a conflict.
 * Wicked problem: a problem that doesn't have a true/false solution, but only a solution that's better or worse than the status quo (recommended by John)