The Lily HBase Indexer Service
The Lily HBase Indexer service allows you to query data stored in HBase with Search. The indexer service indexes the stream of records being added to HBase tables.
The Lily HBase Indexer service is installed in the same parcel or package as Cloudera Search. Like Cloudera Search, it is not deployed by the installation wizard; you must add it as a service after the installation process has finished.
The HBase indexer depends on HBase, Solr, and ZooKeeper, all of which must already be deployed.
Adding the Indexer Service to the Cluster
Once the initial cluster has been set up, you can add the Lily HBase Indexer service to this cluster.
- Click the Services menu, then choose All Services.
- From the Actions menu, select Add a Service. A list of possible services is displayed. You can add one type of service at a time.
- Choose the Keystore Indexer service.
- Follow the wizard to add the Keystore Indexer service to your cluster.
- Select which hosts on your cluster should run the HBase Indexer roles.
- Select or confirm the dependent services.
- The service is not started automatically: you must first enable HBase replication and indexing, and then start the service.
To enable Morphlines with Search and HBase indexing:
Cloudera Morphlines is an open source framework that reduces the time and skills necessary to build or change Search indexing applications. A morphline is a rich configuration file that simplifies defining an ETL transformation chain.
- Select the newly-added Indexer service, and from the Configuration menu, select View and Edit.
- Create the necessary configuration files, and modify the content in the following properties under the Service-Wide > Morphlines category:
- Morphlines File — Text that goes into the morphlines.conf used by HBase indexers. Note that you should use $ZK_HOST in this file instead of specifying a ZooKeeper quorum. Cloudera Manager automatically replaces the $ZK_HOST variable with the correct value during the Solr configuration deployment.
- Custom MIME-types File — Text that goes verbatim into the custom-mimetypes.xml file used by HBase Indexers with the detectMimeTypes command. See the Cloudera Morphlines Reference Guide for details on this command.
- Grok Dictionary File — Text that goes verbatim into the grok-dictionary.conf file used by HBase Indexers with the grok command. See the Cloudera Morphlines Reference Guide for details of this command.
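As an illustration of the Morphlines File property, the fragment below is a minimal sketch of a morphlines.conf for the HBase indexer. The collection name, the morphline id, and the column family "data" are hypothetical placeholders, and the package patterns in importCommands are an assumption that depends on the Morphlines release in use. What it is meant to show is the $ZK_HOST placeholder standing in for a literal ZooKeeper quorum.

```
# Sketch only: names below are placeholders, not values from this guide.
SOLR_LOCATOR : {
  # Hypothetical Solr collection that receives the indexed documents
  collection : hbase-collection1

  # Leave the placeholder as is; Cloudera Manager substitutes the
  # ZooKeeper quorum when it deploys the configuration
  zkHost : "$ZK_HOST"
}

morphlines : [
  {
    # Hypothetical morphline id
    id : morphline1

    # Assumption: Kite-based package names; older releases may use
    # different patterns, so adjust to match your installation
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]

    commands : [
      {
        # Map every cell in the hypothetical "data" column family
        # to a Solr field named "data"
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:*"
              outputField : "data"
              type : string
              source : value
            }
          ]
        }
      }
    ]
  }
]
```

Because the quorum is never written into the file, the same text can be pasted into the Morphlines File property on any cluster without editing host names.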
To enable HBase indexing:
- From the Services menu, select the appropriate HBase service.
- From the Configuration menu, select View and Edit.
- Select the Backup category.
- Check (enable) the properties for Enable Replication and Enable Indexing, and Save Changes.
- From the HBase Actions menu, Restart the HBase service.
- When the HBase service has restarted, you can start the Indexer service.
For information on using the Indexer, see Using the HBase Indexer Service.