Text and Knowledge Modeling

Words are grouped into sentences, paragraphs, chapters or parts of a book, corpora, and collections of corpora. Words that occur in one grouping may or may not appear in another. For example, complex legal or technical terms are unlikely to appear in a novel such as the Harry Potter series. Likewise, dramatic words are seldom used in a technical book like ours.

Groupings represent a higher dimension for words within texts. Hence, any analysis must provide a holistic view of a word within the context of its groupings, whether in terms of vocabulary, grammar, semantic meaning, or pragmatic context.

We now see words within a language as elements of a complex system, which may at the same time also be complicated. That is, a specific language is understood as a complex system and may be extremely complicated (depending on the language). It is surely both highly complex and highly complicated when viewed across languages and cultures, and along other dimensions such as psycholinguistics.

In this part, two chapters use graph theory (or network science) as their base methodology. Chapter 7 focuses on “text networks”, where we study texts as networks of words. Chapter 8 then focuses on “text classification models”, that is, on how texts are “grouped” and what the classifications of those groups are. The final chapter of this part, Chapter 9, discusses knowledge graphs, with a focus on the Tafseer of Imam Ibnu Katheer.

Some major unanswered questions are summarized at the end of the chapter as directions for future research.