Use the solrctl utility to manage a SolrCloud deployment. You can manipulate SolrCloud collections, SolrCloud collection instance directories, and individual cores.
A SolrCloud collection is the top-level object for indexing documents and providing a query interface. Each SolrCloud collection must be associated with an instance directory, although different collections can use the same instance directory. Each SolrCloud collection is typically replicated (sharded) among several SolrCloud instances. Each replica is called a SolrCloud core and is assigned to an individual SolrCloud host. The assignment process is managed automatically, although you can apply fine-grained control over each individual core using the core command.
A typical deployment workflow with solrctl consists of:
- Deploying the ZooKeeper coordination service.
- Deploying solr-server daemons to each host.
- Initializing the state of the ZooKeeper coordination service using the init command.
- Starting each solr-server daemon.
- Generating an instance directory.
- Uploading the instance directory to ZooKeeper.
- Associating a new collection with the name of the instance directory.
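The solrctl portion of this workflow can be sketched as a small shell function. This is an illustrative sketch only: the function name create_collection_from_template and its three arguments are hypothetical, and it assumes that ZooKeeper and the solr-server daemons are already running and that solrctl init was run once before the daemons were first started.

```shell
# Hypothetical helper (not part of solrctl): generate a template instance
# directory, upload it, and create a collection that uses it.
create_collection_from_template() {
    name=$1      # used for both the instance directory and the collection
    workdir=$2   # local path where the template is generated
    shards=$3    # number of shards for the new collection

    # Generate the instance directory template on the local filesystem.
    solrctl instancedir --generate "$workdir" || return 1

    # Upload the instance directory to ZooKeeper under the given name.
    solrctl instancedir --create "$name" "$workdir" || return 1

    # Create the collection. Because -c is omitted, the instance directory
    # name is assumed to be the same as the collection name.
    solrctl collection --create "$name" -s "$shards"
}
```

For example, create_collection_from_template demo "$HOME/demo_config" 2 would generate a template under $HOME/demo_config and create a two-shard collection named demo. Edit the generated files under conf before uploading if the default schema does not fit your data.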
In general, if an operation succeeds, solrctl exits silently with a success exit code. If an error occurs, solrctl prints a diagnostic message and exits with a failure exit code.
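Because success is silent, scripts that call solrctl usually need to check the exit code themselves. The following is a minimal sketch of one way to do that. The wrapper name run_solrctl is hypothetical, and the exact non-zero codes that solrctl returns are not specified here, so the wrapper passes through whatever code it receives.

```shell
# Hypothetical wrapper: run a solrctl command and report the outcome.
run_solrctl() {
    if solrctl "$@"; then
        # solrctl prints nothing on success, so confirm it explicitly.
        echo "solrctl $*: ok"
    else
        # In the else branch, $? still holds the exit code of solrctl.
        status=$?
        echo "solrctl $*: failed with exit code $status" >&2
        return "$status"
    fi
}
```

A script could then call, for example, run_solrctl collection --list and stop at the first failure.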
You can execute solrctl on any host that is configured as part of the SolrCloud deployment. To execute any solrctl command on a host outside the SolrCloud deployment, ensure that the SolrCloud hosts are reachable and provide the --zk and --solr command-line options.
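As a sketch of running solrctl from a host outside the deployment, the wrapper below always supplies both options. The wrapper name remote_solrctl and the two variable names are hypothetical, and the host names are the sample values used in the option descriptions on this page; replace them with your own.

```shell
# Sample values only; replace with your ZooKeeper ensemble and Solr URI.
ZK_ENSEMBLE='host1.cluster.com:2181,host2.cluster.com:2181/solr'
SOLR_URI='http://host1.cluster.com:8983/solr'

# Hypothetical wrapper: pass --zk and --solr on every invocation so that
# solrctl can reach the deployment from a host that is not part of it.
remote_solrctl() {
    solrctl --zk "$ZK_ENSEMBLE" --solr "$SOLR_URI" "$@"
}
```

With this in place, remote_solrctl collection --list lists the collections registered in the remote deployment.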
The solrctl commands init, instancedir, collection, and cluster affect the entire SolrCloud deployment and are executed only once per required operation.
The solrctl core command affects a single SolrCloud host.
If you are using solrctl to manage your deployment in an environment that requires Kerberos authentication, you must have a valid Kerberos ticket, which you can get using kinit.
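One way to make sure a ticket is present before calling solrctl is sketched below. It assumes MIT Kerberos client tools, where klist -s runs silently and exits with a non-zero status if the credentials cache cannot be read or has expired. The function name ensure_kerberos_ticket is hypothetical.

```shell
# Hypothetical helper: obtain a Kerberos ticket only if none is valid.
# Any arguments (for example, a principal name) are passed to kinit.
ensure_kerberos_ticket() {
    # klist -s prints nothing; its exit status reports ticket validity.
    if ! klist -s; then
        kinit "$@" || return 1
    fi
}
```

A script might call ensure_kerberos_ticket before its first solrctl command. Note that kinit prompts for a password unless it is given other credentials.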
You can see examples of using solrctl in Deploying Cloudera Search.
Using solrctl with an HTTP proxy
Using solrctl to manage a deployment in an environment that uses an HTTP proxy (http_proxy) fails because solrctl uses curl, which attempts to route requests through the web proxy. You can disable the proxy so that solrctl succeeds:
- Modify the settings for the current shell by exporting the NO_PROXY variable. For example:
$ export NO_PROXY='*'
- Modify the settings for single commands by prefacing solrctl commands with NO_PROXY='*'. For example:
$ NO_PROXY='*' solrctl collection --create yourCollectionName
You can initialize the state of the entire SolrCloud deployment and each individual host within the SolrCloud deployment by using solrctl. The general solrctl command syntax is:
solrctl [options] command [command-arg] [command [command-arg]] ...
Each element and its possible values are described in the following sections.
- --solr solr_uri: Directs solrctl to a SolrCloud web API available at a given URI. This option is required for hosts running outside of SolrCloud. A sample URI might be: http://host1.cluster.com:8983/solr.
- --zk zk_ensemble: Directs solrctl to a particular ZooKeeper coordination service ensemble. This option is required for hosts running outside of SolrCloud. For example: host1.cluster.com:2181,host2.cluster.com:2181/solr.
- --help: Prints help.
- --quiet: Suppresses most solrctl messages.
- init [--force]: The init command, which initializes the overall state of the SolrCloud deployment, must be executed before starting solr-server daemons for the first time. Use this command cautiously because it erases all SolrCloud deployment state information. After successful initialization, you cannot recover any previous state.
- instancedir [--generate path [-schemaless]] [--create name path] [--update name path] [--get name path] [--delete name] [--list]: Manipulates the instance directories. The following options are supported:
- --generate path: Generates a template of the instance directory. The template is stored at the designated path in the local filesystem and has configuration files under /conf. See Solr's README.txt for the complete layout.
- -schemaless: Generates a schemaless template of the instance directory. For more information on schemaless support, see Using Schemaless Mode (CDH 5.1 or later only).
- --create name path: Pushes a copy of the instance directory from the local filesystem to SolrCloud. If an instance directory with this name is already known to SolrCloud, this command fails. To change an instance directory that already exists, see --update.
- --update name path: Updates an existing SolrCloud copy of an instance directory based on the files in a local filesystem. This command is analogous to first using --delete name followed by --create name path.
- --get name path: Downloads the named instance directory to the given path in the local filesystem. Once downloaded, the files can be edited.
- --delete name: Deletes the instance directory name from SolrCloud.
- --list: Prints a list of all available instance directories known to SolrCloud.
- collection [--create name -s <numShards> [-a] [-c <collection.configName>] [-r <replicationFactor>] [-m <maxShardsPerHost>] [-n <createHostSet>]] [--delete name] [--reload name] [--stat name] [--list] [--deletedocs name]: Manipulates collections. The following options are supported:
- --create name -s <numShards> [-a] [-c <collection.configName>] [-r <replicationFactor>] [-m <maxShardsPerHost>] [-n <createHostSet>]: Creates a new collection.
New collections are given the specified name and are divided into <numShards> shards.
The -a option configures auto-addition of replicas if machines hosting existing shards become unavailable.
SolrCloud hosts are configured using the <collection.configName> instance directory. Replication is configured by a factor of <replicationFactor>. The maximum number of shards per host is set by <maxShardsPerHost>, and the collection is allocated to the hosts specified in <createHostSet>.
The only required parameters are name and numShards. If collection.configName is not provided, it is assumed to be the same as the name of the collection.
- --delete name: Deletes a collection.
- --reload name: Reloads a collection.
- --stat name: Outputs SolrCloud-specific run-time information for a collection.
- --list: Lists all collections registered in SolrCloud.
- --deletedocs name: Purges all indexed documents from a collection.
- core [--create name [-p name=value]...] [--reload name] [--unload name] [--status name]: Manipulates cores. This is one of two commands that you can execute on a particular SolrCloud host. Use this expert command with caution. The following options are supported:
- --create name [-p name=value]...: Creates a new core on a given SolrCloud host. The core is configured using name=value pairs. For more details on configuration options, see the Solr documentation.
- --reload name: Reloads a core.
- --unload name: Unloads a core.
- --status name: Prints status of a core.
- cluster [--get-solrxml file] [--put-solrxml file]: Manages cluster configuration. The following options are supported:
- --get-solrxml file: Downloads the cluster configuration file solr.xml from ZooKeeper to the local system.
- --put-solrxml file: Uploads the specified file to ZooKeeper as the cluster configuration file solr.xml.
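The commands above are often combined. As one illustration, the sketch below pushes edited configuration files and then reloads the collection that uses them. The function name push_config_and_reload is hypothetical. The sketch assumes that the collection has the same name as its instance directory (the default when -c is omitted at creation time) and that a reload is needed for the running collection to pick up the updated files.

```shell
# Hypothetical helper: upload a locally edited instance directory and
# reload the collection of the same name.
push_config_and_reload() {
    name=$1   # instance directory name, assumed equal to the collection name
    path=$2   # local path that holds the edited instance directory

    # Replace the SolrCloud copy of the instance directory with local files.
    solrctl instancedir --update "$name" "$path" || return 1

    # Reload the collection (assumed necessary to apply the new files).
    solrctl collection --reload "$name"
}
```

A typical sequence would be to download the current files with solrctl instancedir --get, edit them locally, and then call push_config_and_reload with the same name and path.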