This is the documentation for Cloudera Manager 4.8.5.
Documentation for other versions is available at Cloudera Documentation.

The Lily HBase Indexer Service

The Lily HBase Indexer service allows you to query data stored in HBase with Search. The indexer service indexes the stream of records being added to HBase tables.

The Lily HBase Indexer service is installed in the same parcel or package as the Solr service. Like the Solr service, it is not deployed by the installation wizard; you must add it as a service after the installation process has finished.

  Note: The Indexer service depends on the HBase, ZooKeeper, Solr, and HDFS services.
  Note: Cloudera Manager allows you to add Solr and Key-Value Store Indexer services even if the CDH version deployed in your cluster (for example, CDH 4.4) does not support Cloudera Search. However, you will not be able to start the services.

Adding the Indexer Service

  1. Click the Services menu, then choose All Services.
  2. From the Actions menu, select Add a Service. A list of possible services is displayed. You can add one type of service at a time.
  3. Choose the Key-Value Store Indexer service.
  4. Follow the wizard for adding the service to your cluster.
    1. Select which hosts on your cluster should run the HBase Indexer roles.
    2. Select or confirm the dependent services.
  5. The service is not started automatically: you must first enable HBase replication and indexing, and then start the service.
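
If you prefer to script these steps rather than use the wizard, the Cloudera Manager REST API can express the same operation. The following is a minimal sketch that only builds the requests and does not send them. The API version ("v5"), the service type "KS_INDEXER", the role type "HBASE_INDEXER", and all cluster, service, role, and host names are assumptions for illustration; check them against the API documentation for your Cloudera Manager release.

```python
import json
from urllib.parse import quote


def add_service_request(cluster, service_name, api_version="v5"):
    """Build (method, path, body) for creating a Key-Value Store Indexer service.

    The service type string "KS_INDEXER" is an assumption; verify it for your release.
    """
    path = "/api/{}/clusters/{}/services".format(api_version, quote(cluster, safe=""))
    body = {"items": [{"name": service_name, "type": "KS_INDEXER"}]}
    return "POST", path, json.dumps(body, sort_keys=True)


def add_role_request(cluster, service_name, role_name, host_id, api_version="v5"):
    """Build (method, path, body) for placing one HBase Indexer role on a host.

    The role type string "HBASE_INDEXER" is an assumption; verify it for your release.
    """
    path = "/api/{}/clusters/{}/services/{}/roles".format(
        api_version, quote(cluster, safe=""), quote(service_name, safe="")
    )
    body = {
        "items": [
            {"name": role_name, "type": "HBASE_INDEXER", "hostRef": {"hostId": host_id}}
        ]
    }
    return "POST", path, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical names; nothing is sent to a server.
    for request in (
        add_service_request("Cluster 1", "ks_indexer1"),
        add_role_request("Cluster 1", "ks_indexer1", "ks_indexer1-INDEXER-1", "host1.example.com"),
    ):
        print(*request)
```

As in the wizard, creating the service this way does not start it; HBase replication and indexing still have to be enabled first.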

Enabling Morphlines with Search and HBase Indexing

Cloudera Morphlines is an open source framework that reduces the time and skills necessary to build or change Search indexing applications. A morphline is a rich configuration file that simplifies defining an ETL transformation chain.

  1. Select the newly added Indexer service, and from the Configuration menu, select View and Edit.
  2. Create the necessary configuration files, and modify the content in the following properties under the Service-Wide > Morphlines category:
    • Morphlines File — Text that goes into the morphlines.conf used by HBase indexers. Note that you should use $ZK_HOST in this file instead of specifying a ZooKeeper quorum. Cloudera Manager automatically replaces the $ZK_HOST variable with the correct value during the Solr configuration deployment.
    • Custom MIME-types File — Text that goes verbatim into the custom-mimetypes.xml file used by HBase Indexers with the detectMimeTypes command. See the Cloudera Morphlines Reference Guide for details on this command.
    • Grok Dictionary File — Text that goes verbatim into the grok-dictionary.conf file used by HBase Indexers with the grok command. See the Cloudera Morphlines Reference Guide for details of this command.
    See Extracting, Transforming, and Loading Data With Cloudera Morphlines for information about using morphlines with Search and HBase.
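
To illustrate the effect of the $ZK_HOST placeholder described above, the following sketch holds a small morphlines file as text and substitutes a quorum string into it. It shows only the outcome of the replacement, not how Cloudera Manager implements it. The collection name, the column mapping, and the quorum value are hypothetical, and the morphline content is an assumed minimal example rather than a recommended configuration.

```python
# Assumed minimal morphlines.conf content; "collection1" and "data:*" are hypothetical.
MORPHLINES_TEMPLATE = """
SOLR_LOCATOR : {
  collection : collection1
  zkHost : "$ZK_HOST"
}

morphlines : [
  {
    id : morphline1
    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:*"
              outputField : "data"
              type : string
              source : value
            }
          ]
        }
      }
    ]
  }
]
"""


def render_morphlines(template, zk_quorum):
    """Return the template with every $ZK_HOST occurrence replaced by zk_quorum."""
    return template.replace("$ZK_HOST", zk_quorum)


if __name__ == "__main__":
    # Hypothetical quorum string, for illustration only.
    print(render_morphlines(MORPHLINES_TEMPLATE, "zk1.example.com:2181/solr"))
```

Keeping the placeholder in the file, instead of a literal quorum, means the text does not need to be edited if the ZooKeeper hosts change.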

Enabling HBase Indexing

  1. From the Services menu, select the HBase service.
  2. Select Configuration > View and Edit.
  3. Select the Backup category.
  4. Select the Enable Replication and Enable Indexing checkboxes, and click Save Changes.
  5. Restart the HBase service.
  6. Start the Indexer service.
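
The order of the steps above matters: the HBase settings are saved first, HBase is restarted next, and the Indexer service is started last. The sketch below expresses that sequence as an ordered list of Cloudera Manager REST API requests without sending them. The configuration names "hbase_enable_replication" and "hbase_enable_indexing", the API version, and the cluster and service names are assumptions; confirm them against the configuration of your own deployment before use.

```python
import json
from urllib.parse import quote


def enable_indexing_plan(cluster, hbase_service, indexer_service, api_version="v5"):
    """Return the ordered (method, path, body) requests for enabling HBase indexing.

    Configuration names below are assumptions; verify them for your release.
    """
    base = "/api/{}/clusters/{}/services".format(api_version, quote(cluster, safe=""))
    config = {
        "items": [
            {"name": "hbase_enable_replication", "value": "true"},
            {"name": "hbase_enable_indexing", "value": "true"},
        ]
    }
    return [
        # Step 4: save both properties on the HBase service.
        ("PUT", "{}/{}/config".format(base, hbase_service), json.dumps(config)),
        # Step 5: restart HBase so the new settings take effect.
        ("POST", "{}/{}/commands/restart".format(base, hbase_service), None),
        # Step 6: start the Indexer service only after HBase is back up.
        ("POST", "{}/{}/commands/start".format(base, indexer_service), None),
    ]


if __name__ == "__main__":
    # Hypothetical names; nothing is sent to a server.
    for method, path, body in enable_indexing_plan("Cluster 1", "hbase1", "ks_indexer1"):
        print(method, path, body or "")
```

A real script would also wait for the restart command to complete before issuing the final start request.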

For information on using the Indexer, see Using the Lily HBase NRT Indexer Service.