Sphinx Architectural Overview

Sphinx is a standalone set of programs. The two main programs are:

indexer
A program that fetches documents from specified sources (e.g., from MySQL query results) and creates a full-text index over them. This is a background batch job, which sites usually run regularly.

searchd
A daemon that serves search queries from the indexes that indexer builds. It provides the runtime support for applications.

The Sphinx distribution also includes native searchd client APIs in a number of programming languages (PHP, Python, Perl, Ruby, and Java, at the time of this writing) and SphinxSE, a client implemented as a pluggable storage engine for MySQL 5.0 and newer. The APIs and SphinxSE allow a client application to connect to searchd, pass it the search query, and fetch back the search results.

Each Sphinx full-text index can be compared to a table in a database; in place of rows in a table, the Sphinx index consists of documents. (Sphinx also has a separate data structure called a multivalued attribute, discussed later.) Each document has a unique 32-bit or 64-bit integer identifier that should be drawn from the database table being indexed (for instance, from a primary key column). In addition, each document has one or more full-text fields (each corresponding to a text column from the database) and numerical attributes. Like a database table, the Sphinx index has the same fields and attributes for all of its documents. Table C-1 shows the analogy between a database table and a Sphinx index.


Database structure
CREATE TABLE documents (
  `id` int(11) NOT NULL auto_increment,
  `title` varchar(255),
  `content` text,
  `group_id` int(11),
  `added` datetime,
  PRIMARY KEY (id)
);


Sphinx structure
index documents
document ID
title field, full-text indexed
content field, full-text indexed
group_id attribute, sql_attr_uint
added attribute, sql_attr_timestamp


Sphinx does not store the text fields from the database but just uses their contents to build a search index.
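The division of labor described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Sphinx's actual implementation or index format: the function names build_index and search, the sample rows, and the all-words matching rule are hypothetical and chosen for illustration. It shows a batch step that turns database rows into keyword postings plus stored attributes, and a query step that returns document IDs and attributes, with the original text never kept in the index.

```python
import re
from collections import defaultdict


def build_index(rows, fields, attrs):
    """Toy stand-in for the indexer batch job (hypothetical, not Sphinx's format)."""
    postings = defaultdict(set)  # keyword -> set of document IDs
    attributes = {}              # document ID -> stored attribute values
    for row in rows:
        doc_id = row["id"]
        for field in fields:
            # The field text is only tokenized; the text itself is not kept.
            for word in re.findall(r"\w+", row[field].lower()):
                postings[word].add(doc_id)
        attributes[doc_id] = {name: row[name] for name in attrs}
    return {"postings": dict(postings), "attributes": attributes}


def search(index, query):
    """Toy stand-in for searchd: documents that contain every query word."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    ids = set.intersection(*(index["postings"].get(w, set()) for w in words))
    return [(doc_id, index["attributes"][doc_id]) for doc_id in sorted(ids)]


# Sample rows shaped like the documents table above (made-up data).
rows = [
    {"id": 1, "title": "Sphinx overview", "content": "indexer builds the index",
     "group_id": 5, "added": 1200000000},
    {"id": 2, "title": "MySQL tuning", "content": "searchd serves the index",
     "group_id": 7, "added": 1200000500},
]

index = build_index(rows, fields=("title", "content"), attrs=("group_id", "added"))
matches = search(index, "index searchd")
```

In this sketch a match carries only the document ID and its attributes, so an application that wants to display the title or content would use the returned IDs to look up the rows in the database.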

Source of Information: O'Reilly, High Performance MySQL, Second Edition
