Validating the Cloudera Search Deployment

After installing and deploying Cloudera Search, you can validate the deployment by indexing and querying sample documents. Before beginning this process, make sure you have access to the Apache Solr admin web console, as described in Creating Collections.

Creating a Solr Collection

  1. On a host running Solr Server, make sure that the SOLR_ZK_ENSEMBLE environment variable is set in /etc/solr/conf/solr-env.sh. For example:
    $ cat /etc/solr/conf/solr-env.sh
    export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr

    If you are using Cloudera Manager, this is automatically set on hosts with a Solr Server or Gateway role.

  2. Generate configuration files for the collection:
    $ solrctl instancedir --generate $HOME/test_config
  3. Upload the configuration to ZooKeeper:
    $ solrctl instancedir --create test_config $HOME/test_config
  4. Create a new collection with two shards by using the uploaded configuration directory:
    $ solrctl collection --create test_collection -s 2 -c test_config
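
The prerequisite in step 1 can be checked with a small script before running the solrctl commands. The following is a minimal sketch, not part of Cloudera Search: the function name check_zk_ensemble is hypothetical, and it only tests that the file contains an export line for SOLR_ZK_ENSEMBLE, not that the ZooKeeper ensemble is reachable.

```shell
# Hypothetical helper (not shipped with Cloudera Search).
# Succeeds if the given file exports a non-empty SOLR_ZK_ENSEMBLE value.
# Defaults to the path used in step 1.
check_zk_ensemble() {
  env_file="${1:-/etc/solr/conf/solr-env.sh}"
  if grep -Eq '^[[:space:]]*export[[:space:]]+SOLR_ZK_ENSEMBLE=.+' "$env_file" 2>/dev/null; then
    echo "SOLR_ZK_ENSEMBLE is set in $env_file"
  else
    echo "SOLR_ZK_ENSEMBLE is not set in $env_file" >&2
    return 1
  fi
}

# Usage on a Solr Server host:
#   check_zk_ensemble && solrctl instancedir --generate "$HOME/test_config"
```

After step 4, running solrctl instancedir --list and solrctl collection --list should show test_config and test_collection respectively.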

Indexing Sample Data

Cloudera Search includes sample data for testing and validation. Run the following commands to index this data for searching. Replace search01.example.com in the example below with the name of any host running the Solr Server process.
  • Parcel-based Installation:
    $ cd /opt/cloudera/parcels/CDH/share/doc/solr-doc*/example/exampledocs
    $ java -Durl=http://search01.example.com:8983/solr/test_collection/update -jar post.jar *.xml
  • Package-based Installation:
    $ cd /usr/share/doc/solr-doc*/example/exampledocs
    $ java -Durl=http://search01.example.com:8983/solr/test_collection/update -jar post.jar *.xml
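
The two variants above differ only in the directory that holds the sample documents. A sketch such as the following could cover both cases in one script. The function names solr_update_url and find_exampledocs and the SOLR_HOST and COLLECTION variables are assumptions made for illustration; the paths, port, and post.jar invocation are taken from the commands above.

```shell
# Assumption: override these for your cluster before running.
SOLR_HOST="${SOLR_HOST:-search01.example.com}"
COLLECTION="${COLLECTION:-test_collection}"

# Build the update URL for a given host and collection.
solr_update_url() {
  printf 'http://%s:8983/solr/%s/update\n' "$1" "$2"
}

# Print the first sample-document directory that exists,
# trying the parcel location first and the package location second.
find_exampledocs() {
  for dir in /opt/cloudera/parcels/CDH/share/doc/solr-doc*/example/exampledocs \
             /usr/share/doc/solr-doc*/example/exampledocs; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  echo "no exampledocs directory found" >&2
  return 1
}

# Usage (on a host where the sample data is installed):
#   cd "$(find_exampledocs)" &&
#     java -Durl="$(solr_update_url "$SOLR_HOST" "$COLLECTION")" -jar post.jar *.xml
```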

Querying Sample Data

Run a query to verify that the sample data was indexed successfully and is searchable:

  1. Open the Solr admin web interface in a browser by accessing http://search01.example.com:8983/solr. Replace search01.example.com with the name of any host running the Solr Server process.
  2. Select Cloud from the left panel.
  3. Select one of the hosts listed for the test_collection collection.
  4. From the Core Selector drop-down menu in the left panel, select the test_collection shard.
  5. Select Query from the left panel and select Execute Query. If you see results such as the following, indexing was successful:
      "response": {
        "numFound": 32,
        "start": 0,
        "maxScore": 1,
        "docs": [
          {
            "id": "SP2514N",
            "name": "Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133",
            "manu": "Samsung Electronics Co. Ltd.",
            "manu_id_s": "samsung",
            "cat": [
              "electronics",
              "hard drive"
            ],
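
As an alternative to the browser steps above, the same check can be made from a shell by querying the collection's select handler. This is a minimal sketch that assumes an unsecured cluster (no Kerberos or TLS) and that curl is installed. The function names solr_select_url and extract_num_found are hypothetical, and the sed expression is a simple text match rather than a full JSON parser.

```shell
# Build a match-all query URL that returns only the hit count (rows=0).
solr_select_url() {
  printf 'http://%s:8983/solr/%s/select?q=*:*&rows=0&wt=json\n' "$1" "$2"
}

# Read a Solr JSON response on stdin and print the first numFound value.
extract_num_found() {
  sed -n 's/.*"numFound"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (replace the host with any host running Solr Server):
#   curl -s "$(solr_select_url search01.example.com test_collection)" | extract_num_found
```

A nonzero count indicates that documents were indexed and are searchable.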

Next Steps

After you have verified that Cloudera Search is installed and running properly, you can experiment with other methods of ingesting and indexing data.

To learn more about Solr, see the Apache Solr Tutorial.