Indexing in MarkLogic

January 01, 2022

MarkLogic makes use of multiple types of indexes to resolve queries.

Index Topics:

The Universal Index
Other Types of Indexes
Index Size
Fields
Reindexing
Relevance
Indexing Document Metadata
Fragmentation of XML Documents

The Universal Index

The universal index indexes the structure and content of every XML or JSON document by default, so documents become searchable without any changes to the database or server configuration.

The universal index covers:

Word Indexing
Phrase Indexing
Relationship Indexing
Value Indexing

Word and Phrase Indexing

Word Indexing: MarkLogic indexes every word of a document as the document is ingested.

MarkLogic Server resolves word queries with an inverted index: a list of all of the words in all the documents in the database and, for each word, a list of the documents that contain that word.

Inverted index:

MarkLogic Server indexes XML and JSON data as a word-to-document relationship. An inverted index is called "inverted" because, instead of listing the words found in each document, it lists the documents found for each word.

Term list:

Each entry in the inverted index is called a term list. A term can be a word or a phrase consisting of multiple words.
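
The idea of an inverted index and its term lists can be sketched with a toy model. This is plain Python, not MarkLogic code or MarkLogic's actual implementation; the function name `build_inverted_index`, the sample URIs, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased word to the sorted list of document URIs containing it."""
    index = defaultdict(set)
    for uri, text in docs.items():
        for word in text.lower().split():
            index[word].add(uri)
    return {word: sorted(uris) for word, uris in index.items()}

# Hypothetical sample documents, keyed by URI.
docs = {
    "/a.xml": "the quick brown fox",
    "/b.xml": "the lazy dog",
    "/c.xml": "quick thinking dog",
}
index = build_inverted_index(docs)

# One term list per word: "quick" maps to the documents that contain it.
quick_docs = index.get("quick", [])

# A query for two words intersects their term lists.
quick_and_dog = sorted(set(index.get("quick", [])) & set(index.get("dog", [])))
```

In this sketch, a word query never opens a document: it reads one term list, and a multi-word query intersects several.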

Phrase Indexing: To resolve a two-word phrase query, MarkLogic uses term lists to find the documents that contain both words and then looks inside those candidate documents to determine whether the two words appear together as a phrase.

Enable word positions to include position information for each term in the term list. That way you know at what location in the document each word appears. By looking for term hits with adjoining positions, you can find the answer using just indexes. This avoids having to look inside any documents.
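
A rough sketch of how position information lets a phrase be resolved from the index alone is shown here. It is a simplified Python model, not MarkLogic's implementation; `build_positional_index` and `phrase_match` are hypothetical names, and positions are simple word offsets.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each word to {uri: [positions]}, with positions counted from 0."""
    index = defaultdict(lambda: defaultdict(list))
    for uri, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][uri].append(pos)
    return index

def phrase_match(index, first, second):
    """Return URIs where `second` appears immediately after `first`."""
    hits = []
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    # Only documents that contain both words are candidates.
    for uri in sorted(set(first_postings) & set(second_postings)):
        following = set(second_postings[uri])
        # Adjoining positions mean the two words form a phrase.
        if any(pos + 1 in following for pos in first_postings[uri]):
            hits.append(uri)
    return hits

# Both hypothetical documents contain both words, but only one has the phrase.
docs = {
    "/a.xml": "search terms are indexed",
    "/b.xml": "terms used in a search",
}
pos_index = build_positional_index(docs)
phrase_hits = phrase_match(pos_index, "search", "terms")
```

Without positions, both documents would be candidates and each would have to be opened to check for the phrase. With positions, the second document is rejected using the index alone.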

Relationship Indexing:

In addition to creating a term list for each word, MarkLogic creates a term list for each XML element or JSON property in documents. MarkLogic automatically indexes element relationships to keep track of parent-child element hierarchies.
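
As an illustration only, structural term lists can be modeled the same way as word term lists. The Python sketch below is not MarkLogic's implementation: it assumes JSON documents represented as nested dictionaries, and the term shapes `("property", name)` and `("parent-child", parent, name)` are invented for this example.

```python
from collections import defaultdict

def relationship_terms(node, parent=None):
    """Yield a term for each property and for each parent-child property pair."""
    if isinstance(node, dict):
        for name, value in node.items():
            yield ("property", name)
            if parent is not None:
                yield ("parent-child", parent, name)
            yield from relationship_terms(value, name)
    elif isinstance(node, list):
        # Array items keep the enclosing property as their parent.
        for item in node:
            yield from relationship_terms(item, parent)

def build_structure_index(docs):
    """Map each structural term to the set of document URIs that contain it."""
    index = defaultdict(set)
    for uri, doc in docs.items():
        for term in relationship_terms(doc):
            index[term].add(uri)
    return index

# Hypothetical sample documents.
docs = {
    "/book.json": {"book": {"title": "Title One", "author": {"name": "A"}}},
    "/film.json": {"film": {"title": "Title Two"}},
}
structure_index = build_structure_index(docs)

# Documents where "title" is a direct child of "book".
books_with_title = sorted(structure_index[("parent-child", "book", "title")])
# Documents that have a "title" property anywhere.
any_title = sorted(structure_index[("property", "title")])
```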

Range Indexes

MarkLogic lets you create a range index on any element or attribute, as your project requires.

We create range indexes in MarkLogic for the following reasons:

Perform fast range queries.
Quickly extract specific values from the entries in a result set.
Perform optimized order by calculations.
Perform efficient cross-document joins.
Quickly extract co-occurring values from the entries in a result set.
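
To see why a range index helps with these tasks, it can be pictured as a list of (value, document) pairs kept in value order. The Python sketch below is a conceptual model under that assumption, not MarkLogic's storage format; `build_range_index`, `range_query`, and the `price` field are hypothetical.

```python
import bisect

def build_range_index(docs, field):
    """Return a list of (value, uri) pairs sorted by value."""
    return sorted((doc[field], uri) for uri, doc in docs.items() if field in doc)

def range_query(range_index, low, high):
    """Return URIs whose value satisfies low <= value <= high, in value order."""
    values = [value for value, _ in range_index]
    start = bisect.bisect_left(values, low)
    end = bisect.bisect_right(values, high)
    return [uri for _, uri in range_index[start:end]]

# Hypothetical sample documents; one has no "price" and is not indexed.
docs = {
    "/p1.json": {"price": 30},
    "/p2.json": {"price": 10},
    "/p3.json": {"price": 20},
    "/p4.json": {"name": "no price"},
}
price_index = build_range_index(docs, "price")

# Range query: binary search over sorted values, no document is opened.
mid_priced = range_query(price_index, 15, 30)

# Value extraction: the distinct values come straight from the index.
distinct_prices = sorted({value for value, _ in price_index})
```

Because the pairs are already sorted, results come back ordered by value, which is also why an order-by on an indexed element can skip a separate sort.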

The main range index and lexicon types are:

Element Range Indexes
Attribute Range Indexes
Path Range Indexes
Field Range Indexes
Element Word Lexicons
Attribute Word Lexicons

Create an element range index from the Admin UI:

Open the configured database in the Admin UI, click "Element Range Index", click the "Add" button, fill in the element name and other details, and click "OK".

Create an element range index in code:

xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";

(: Load the current server configuration :)
let $config := admin:get-configuration()

(: Look up the ID of the target database :)
let $dbid := xdmp:database("testing-db")

(: Define a string range index on element "elementname" in namespace "/my/namespace",
   using the root collation and with range value positions disabled :)
let $rangespec := admin:database-range-element-index("string", "/my/namespace", "elementname", "http://marklogic.com/collation/", fn:false())

(: Add the index definition to the database configuration :)
let $new-config := admin:database-add-range-element-index($config, $dbid, $rangespec)

(: Save the updated configuration :)
return admin:save-configuration($new-config)

The main difference between an element range index and an element word lexicon is that the range index will store the string value of the element ("search terms"), and the word lexicon will store individual word tokens ("search", "terms").
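
The contrast can be made concrete with a small Python sketch. It only models what each structure stores and is not MarkLogic code; the function names, the sample values, and the whitespace tokenizer are assumptions.

```python
def range_index_values(element_values):
    """A range index keeps each element's whole string value."""
    return sorted(set(element_values))

def word_lexicon(element_values):
    """A word lexicon keeps the individual word tokens found in the values."""
    return sorted({word for value in element_values for word in value.lower().split()})

# Hypothetical values of the same element across three documents.
values = ["search terms", "range index", "search index"]

stored_values = range_index_values(values)
stored_words = word_lexicon(values)
```

Here `stored_values` holds three full strings, while `stored_words` holds the four distinct tokens "index", "range", "search", and "terms".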
