The Solr Service
Cloudera Search is implemented by the Solr service.
Installing the Solr Service
- Automated Installation by Cloudera Manager: This method (Installation Path A) performs a CDH installation using Cloudera Manager's installation wizard. As part of the installation, Cloudera Manager offers to install Cloudera Search. You can install CDH (and Search) using parcels (recommended) or packages. See the Cloudera Manager Installation Guide for further information.
- Installation Using Your Own Method: If you use this method (Installation Path B), follow the instructions in "Installing Cloudera Search" in the Cloudera Search Installation Guide to install all the required packages manually.
Adding a Solr Service
- Connect to the Cloudera Manager Admin Console.
- Click the Services tab, then choose All Services.
- From the Actions menu, select Add a Service.
- Choose the Solr service.
- Follow the wizard to add the Solr service to your cluster. Select the hosts in your cluster on which to add and configure Solr Servers.
After completing the wizard, Cloudera Manager automatically initializes Solr home in ZooKeeper and HDFS.
Once you have set up the Solr service, you can create collections by following the instructions in the Cloudera Search Installation Guide, in the section "Deploying Cloudera Search in SolrCloud Mode", under the heading "Administering Solr with the solrctl Tool."
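As a rough illustration of what collection creation with solrctl can look like, the sketch below wraps three solrctl calls in a dry-run helper. The collection name (collection1), shard count (1), and configuration directory ($HOME/solr_configs) are assumptions chosen for the example, not values from this guide. Check the exact subcommands and options against the Cloudera Search Installation Guide for your release.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a host where solrctl is installed to run them for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# create_collection NAME SHARDS CONFDIR
# All three arguments are hypothetical example values.
create_collection() {
  name="$1"
  shards="$2"
  confdir="$3"
  # Generate a template configuration directory on the local file system.
  run solrctl instancedir --generate "$confdir"
  # Upload the configuration directory to ZooKeeper under the given name.
  run solrctl instancedir --create "$name" "$confdir"
  # Create the collection with the requested number of shards.
  run solrctl collection --create "$name" -s "$shards"
}

create_collection collection1 1 "$HOME/solr_configs"
```

Running the script unchanged only prints the three commands, which makes it easy to review them before executing anything against the cluster.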
Using Flume with Search
To use a Flume Solr sink, the Flume service must be running on your cluster. See The Flume Service.
Configuring Flume Morphline Solr Sink for use with the Solr Service
See the Cloudera Search User Guide, specifically the section "Flume Near Real-Time Indexing Reference", for information about how to configure the Flume Morphline Solr Sink.
- Go to the Flume service.
- Select .
- Under the Agent role group, find the Configuration File property that holds the flume.conf file. This is the primary configuration file for Flume agents. Modify this file, or paste in your own version. Note that there may be more than one Agent role group; if so, you must configure each one appropriately.
- Under the Agent role group, go to the Flume-NG Solr Sink category. Here you will find the following properties:
- Morphlines File (morphlines.conf) - Configures Morphlines for Flume agents. Note that you should use $ZK_HOST in this file instead of specifying a ZooKeeper quorum. Cloudera Manager automatically replaces the $ZK_HOST variable with the correct value during the Flume configuration deployment.
- Custom MIME-types File (custom-mimetypes.xml) - For use with the detectMimeTypes command. See the Cloudera Morphlines Reference Guide for details on this command.
- Grok Dictionary File (grok-dictionary.conf) - For use with the grok command. See the Cloudera Morphlines Reference Guide for details on this command.
Once configuration is complete, Cloudera Manager automatically deploys the required files to the Flume agent's process directory when it starts the Flume agent. Therefore, you can reference the files in the Flume agent's configuration file using only their (relative path) names. For example, in flume.conf you can use the name morphlines.conf to refer to the location of the morphlines configuration file.
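The two points above (referencing morphlines.conf by name only, and leaving $ZK_HOST as a placeholder) can be sketched as follows. The script writes sample fragments to a scratch directory so they can be reviewed before being pasted into Cloudera Manager. The agent, sink, channel, collection, and morphline names are hypothetical, and the fragments are deliberately incomplete: a working setup also needs sources, channels, and a full morphline definition.

```shell
#!/bin/sh
# Write sample configuration fragments to a scratch directory.
# Names such as agent, solrSink, memoryChannel, collection1 and morphline1
# are hypothetical placeholders.
write_samples() {
  dir="$1"
  mkdir -p "$dir"

  # Sink stanza for flume.conf: morphlines.conf is referenced by name only,
  # because Cloudera Manager deploys it to the agent's process directory.
  cat > "$dir/flume.conf" <<'EOF'
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = memoryChannel
agent.sinks.solrSink.morphlineFile = morphlines.conf
agent.sinks.solrSink.morphlineId = morphline1
EOF

  # Solr locator for morphlines.conf: $ZK_HOST stays literal (quoted heredoc)
  # so that Cloudera Manager can substitute it at deployment time.
  cat > "$dir/morphlines.conf" <<'EOF'
SOLR_LOCATOR : {
  collection : collection1
  zkHost : "$ZK_HOST"
}
EOF
}

# Succeeds only if the given morphlines file uses the $ZK_HOST placeholder.
uses_zk_placeholder() {
  grep -q 'zkHost *: *"\$ZK_HOST"' "$1"
}

sample_dir="${TMPDIR:-/tmp}/search-conf-sample"
write_samples "$sample_dir"
uses_zk_placeholder "$sample_dir/morphlines.conf" && echo "morphlines.conf uses \$ZK_HOST"
```

A check like uses_zk_placeholder is a cheap guard against pasting in a file that still hard-codes a ZooKeeper quorum.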
Deploying Search with Hue
- Go to the Hue service.
- Select .
- Search for the word "safety". This displays a set of Hue Safety Valve properties.
- Add information about your Solr host to the Hue Server Configuration Safety Valve for hue_safety_valve_server.ini found under the Hue Server (Default) / Advanced category. For example, if your hostname is SOLR_HOST, you might add the following:
[search]
## URL of the Solr Server
solr_url=http://SOLR_HOST:8983/solr
- Click Save Changes to save your safety valve changes.
- Restart the Hue service.
- Stop the Hue service.
- From the command line do the following:
cd /opt/cloudera/parcels/CDH4.3.0-1.cdh4.3.0.pXXX/share/hue
(Substitute your own local repository path for /opt/cloudera/parcels/... if yours is different, and specify the name of the CDH4.3 parcel that exists in your repository.)
./build/env/bin/python ./tools/app_reg/app_reg.py --install /opt/cloudera/parcels/SOLR-0.9.0-1.cdh4.3.0.pXXX/share/hue/apps/search
sed -i 's/\.\/apps/..\/..\/..\/..\/..\/apps/g' ./build/env/lib/python2.X/site-packages/hue.pth
where python2.X is the Python version you are using (for example, python2.4).
- Start the Hue service.
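To see what the sed step in the procedure above does before running it against hue.pth, you can apply the same substitution to a single sample line. This is a minimal sketch: the sample path ./apps/search/src is a hypothetical hue.pth entry, not one taken from a real installation.

```shell
#!/bin/sh
# Apply the same substitution as the sed step above to one line of text,
# without modifying any file.
rewrite_pth_line() {
  printf '%s\n' "$1" | sed 's/\.\/apps/..\/..\/..\/..\/..\/apps/g'
}

# Hypothetical hue.pth entry.
rewrite_pth_line './apps/search/src'
# prints: ../../../../../apps/search/src
```

Each occurrence of ./apps is rewritten to point five directory levels up. Since sed -i edits the file in place, it may be worth copying hue.pth to a backup first.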