The semantic search engine Koios++ is used to search for facts in an RDF triplestore.

Project Description


In contrast to a purely syntactic comparison of character strings, semantic search is based on the meaning of a query. This helps with the problems posed by synonyms (intelligent, smart) and homonyms (row: to propel with oars; a linear arrangement of seats). Semantic search is concept-based: the machine uses knowledge and associations to arrive at an answer, much as a human would. The (domain-specific) knowledge used by this type of search engine is structured by means of thesauri, semantic networks and ontologies.

Data formats developed in connection with the Semantic Web provide a formal, machine-readable representation of this knowledge: OWL (Web Ontology Language) and RDFS (Resource Description Framework Schema) are used to describe ontologies. RDFS can be used to construct large knowledge networks that resemble semantic networks. Every RDF(S) statement consists of three elements, called the subject, predicate and object. In this order, the elements form a triple (3-tuple), which offers a simple way to represent natural-language sentences. In a bibliographic context, an example of such a triple could be: "Person P is the author of work W". Meaning is assigned by linking the statements to ontological concepts. Knowledge (a large number of triples) is stored in so-called triplestores. This knowledge base is usually queried with the language SPARQL (SPARQL Protocol And RDF Query Language).
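The subject-predicate-object structure can be illustrated with a minimal sketch in plain Python. It is not how a real triplestore is implemented; the identifiers (`person:P`, `isAuthorOf`, `work:W` and so on) and the helper `match` are made up for illustration. Leaving a position as `None` plays roughly the role that a variable plays in a SPARQL triple pattern.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical statements; the names are illustrative only.
STORE = [
    Triple("person:P", "isAuthorOf", "work:W"),
    Triple("person:P", "isAuthorOf", "work:V"),
    Triple("work:W", "hasSubject", "concept:Depression"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return all triples that agree with every position that is not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "Which works did person P write?" The object position is left open.
works_by_p = [t.obj for t in match(STORE, subject="person:P", predicate="isAuthorOf")]
print(works_by_p)  # ['work:W', 'work:V']
```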

Knowledge networks for psychology

As an important foundation for semantic search, DFKI and ZPID set out to construct a knowledge network for the field of psychology in a joint project with four work packages.

  • (1) Development of ontologies: Analysis of classification schemes and bibliographic data from PSYNDEX, PsychAuthors and ERIC. Conversion into SKOS (Simple Knowledge Organization System) or BIBO (Bibliographic Ontology). Storage in a Subversion repository.
  • (2) Implementation of a ZPID SPARQL server: Based on the Virtuoso Open-Source Edition, a server infrastructure was set up that can store and query the developed ontologies and the corresponding instance data.
  • (3) Data import: Import functions were developed that automatically transfer the data held in PsychAuthors and PSYNDEX into the semantic infrastructure.
  • (4) Completion of the knowledge network: Word n-grams were specified that correspond to the terms in the APA thesaurus, identify further word forms and add alternative terms. This query interpretation process compensates for the users' vocabulary, which is incomplete from a professional and conceptual standpoint.
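The idea behind work package (4) can be sketched as follows: a query is split into word n-grams, and each n-gram is looked up in a table of surface forms that point to a preferred thesaurus term. This is only an illustrative sketch; the table `THESAURUS`, its entries and the function names are assumptions, not the project's actual data or code.

```python
def word_ngrams(text, max_n=3):
    """Return all word n-grams of the text, longest first."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(max_n, 0, -1)
        for i in range(len(words) - n + 1)
    ]


# Hypothetical lookup table: surface form (word form or alternative term)
# mapped to a preferred term. The entries are invented for illustration.
THESAURUS = {
    "bipolar disorder": "Bipolar Disorder",
    "bipolar disorders": "Bipolar Disorder",
    "manic depression": "Bipolar Disorder",
    "psychoanalysis": "Psychoanalysis",
}


def interpret(query):
    """Map a free-text query to the preferred terms its n-grams match."""
    found = []
    for gram in word_ngrams(query):
        term = THESAURUS.get(gram)
        if term is not None and term not in found:
            found.append(term)
    return found


print(interpret("manic depression and psychoanalysis"))
# ['Bipolar Disorder', 'Psychoanalysis']
```

A user who types an alternative term or a plural form is still led to the preferred term, which is the sense in which the lookup compensates for an incomplete vocabulary.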

The semantic search engine Koios++

The semantic search engine Koios++ implements a keyword-based search over RDF(S) data. Using the data, keywords are mapped to SPARQL queries, which are then used for the actual fact search. The advantage of this approach is that users need no explicit knowledge of the schema or of the SPARQL language. The RDF data stored in the knowledge network were made searchable with Koios++, which in this case enables a keyword-based search for psychology literature. The keywords are first mapped to individual elements of the RDF data, and then a connection between these elements is sought. The outcome is displayed as a semantic network, which at the same time illustrates the search results. Koios++ normally computes several results for one query, because the analysis of the entered keywords allows several possible interpretations.
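The two steps described here, mapping keywords to data elements and then looking for a connection between them, can be sketched in a few lines. This is not the Koios++ algorithm itself: the sketch substitutes a plain breadth-first search, and the data, the `LABELS` table and the functions `find_link` and `to_sparql` are all hypothetical. The generated string is a simplified, SPARQL-like rendering rather than a query validated against a real endpoint.

```python
from collections import deque

# Hypothetical RDF-like data as (subject, predicate, object) tuples.
TRIPLES = [
    ("doc:1", "creator", "person:Freud"),
    ("doc:1", "subject", "concept:Psychoanalysis"),
    ("doc:2", "subject", "concept:Depression"),
]

# Hypothetical keyword index: keyword -> element of the data.
LABELS = {"freud": "person:Freud", "psychoanalysis": "concept:Psychoanalysis"}


def find_link(triples, start, goal):
    """Breadth-first search for a shortest chain of triples joining two
    elements, ignoring the direction of the edges."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, p, o)]))
            elif o == node and s not in seen:
                seen.add(s)
                queue.append((s, path + [(s, p, o)]))
    return None


def to_sparql(path, fixed):
    """Render a path as a SPARQL-like string; elements that were not
    matched by a keyword become variables."""
    names = {}

    def term(node):
        if node in fixed:
            return f"<{node}>"
        return names.setdefault(node, f"?v{len(names)}")

    patterns = " . ".join(f"{term(s)} <{p}> {term(o)}" for s, p, o in path)
    return f"SELECT * WHERE {{ {patterns} }}"


elements = [LABELS[k] for k in ("freud", "psychoanalysis")]
path = find_link(TRIPLES, elements[0], elements[1])
query = to_sparql(path, fixed=set(elements))
print(query)
```

A single keyword may match several elements, and several elements may be connected in more than one way, which is one way to picture why one query can yield several interpretations.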

A fundamental problem in information search is that users often cannot sufficiently relate the results of a system to their own input. This is especially likely when users cannot formulate their search problem adequately (e.g. when using keywords). For these reasons, Koios++ attempts to enhance search results with explanations in the form of semantic networks and, in doing so, to enter into a kind of dialogue with the user, helping to solve the search problem and to acquire new knowledge.

Koios++ attempts to adapt explanations of search results to each user by adding or masking information. To this end, statistical and logical methods are employed that take particular account of the user's knowledge of the relevant field. For this purpose, a formula for evaluating the comprehensibility of semantic networks was developed. Its main considerations are the familiarity of terms (nodes in the network) and the transparency of the links between terms (edges in the network). The latter was investigated in an experiment last year; a publication of the results is in preparation.

Search for psychology literature

Koios++’s generic approach can also be used to search for literature in the field of psychology. In this case, the SPARQL queries contain exactly one variable, whose instantiations belong to a specific class of document (article, conference contribution, etc.). The instances, i.e. concrete documents, are annotated with concepts from the APA thesaurus, which are normally kept as specific as possible. Suppose a user is looking for a document associated with the terms "depression" and "psychoanalysis": Koios++ would also suggest a document annotated with the concepts "bipolar disorder" and "psychoanalysis", even if the word "depression" does not appear in the text. This suggestion is only possible because Koios++ uses its background knowledge to infer that the concept "bipolar disorder" is closely related to "depression". In principle, this method increases the number of hits, which is not inherently advantageous. For this reason, a document search with Koios++ mainly provides useful results when the query includes several concepts.
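The effect described above, more hits through related concepts and a narrowing through several query concepts, can be reproduced with a small sketch. The relatedness table `RELATED`, the annotated documents in `DOCS` and both functions are invented for illustration; how Koios++ actually derives and weights related concepts is not specified here.

```python
# Hypothetical background knowledge: concept -> closely related concepts.
RELATED = {"depression": {"bipolar disorder"}}

# Hypothetical documents with their thesaurus annotations.
DOCS = {
    "doc:A": {"bipolar disorder", "psychoanalysis"},
    "doc:B": {"depression", "psychoanalysis"},
    "doc:C": {"bipolar disorder", "behavior therapy"},
}


def expand(concept):
    """Return the concept together with its closely related concepts."""
    return {concept} | RELATED.get(concept, set())


def search(query_concepts):
    """Return documents that carry, for every query concept, either the
    concept itself or one of its related concepts."""
    return [
        doc
        for doc, annotations in sorted(DOCS.items())
        if all(expand(c) & annotations for c in query_concepts)
    ]


print(search(["depression"]))                    # ['doc:A', 'doc:B', 'doc:C']
print(search(["depression", "psychoanalysis"]))  # ['doc:A', 'doc:B']
```

With a single concept, the expansion returns every document in this toy collection. Adding a second concept removes `doc:C` while still keeping `doc:A`, which is never annotated with "depression" itself.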

Further semantic search engines

A semantic search engine in the field of medicine is GoPubMed, while Wolfram Alpha is successful as a computational knowledge engine.



