Use the solrctl utility to manage a SolrCloud deployment, completing tasks such as manipulating SolrCloud collections, SolrCloud collection instance directories, and individual cores.
A SolrCloud collection is the top-level object for indexing documents and providing a query interface. Each SolrCloud collection must be associated with an instance directory, although different collections can use the same instance directory. Each SolrCloud collection is typically replicated (also known as sharded) among several SolrCloud instances. Each replica is called a SolrCloud core and is assigned to an individual SolrCloud host. The assignment process is managed automatically, although you can apply fine-grained control over each individual core using the core command.
A typical deployment workflow with solrctl consists of deploying the ZooKeeper coordination service, deploying solr-server daemons to each host, initializing the state of the ZooKeeper coordination service using the init command, starting each solr-server daemon, generating an instance directory, uploading it to ZooKeeper, and associating a new collection with the name of the instance directory.
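The solrctl portion of this workflow can be sketched as a small shell function. This is only an illustration: the function name deploy_collection, the default local path under $HOME, and the example names are hypothetical, and the sketch assumes that ZooKeeper and the solr-server daemons are already deployed and that solrctl init was run before the daemons first started.

```shell
# Sketch: generate an instance directory, upload it, and create a collection.
# deploy_collection is a hypothetical helper name, not part of solrctl.
deploy_collection() {
  name="$1"                              # used for the instance directory and the collection
  shards="$2"                            # number of shards, passed to -s
  dir="${3:-$HOME/solr_configs/$name}"   # hypothetical local path for the template

  solrctl instancedir --generate "$dir" &&
    solrctl instancedir --create "$name" "$dir" &&
    solrctl collection --create "$name" -s "$shards"
}
```

For example, deploy_collection logs 2 would create a two-shard collection named logs. Because -c is omitted, the collection uses the instance directory that has the same name as the collection.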
In general, if an operation succeeds, solrctl exits silently with a success exit code. If an error occurs, solrctl prints a diagnostic message and exits with a failure exit code.
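Because a successful command normally prints nothing, scripts should test the exit code rather than the output. The following is a minimal sketch; reload_or_report is a hypothetical helper name, and the messages it prints are its own, not solrctl output.

```shell
# Sketch: rely on the solrctl exit code, since success is normally silent.
# reload_or_report is a hypothetical helper name.
reload_or_report() {
  if solrctl collection --reload "$1"; then
    echo "reloaded $1"
  else
    status=$?   # exit code of the failed solrctl call
    echo "reload of $1 failed with exit code $status" >&2
    return "$status"
  fi
}
```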
You can execute solrctl on any host that is configured as part of the SolrCloud deployment. To execute a solrctl command on a host outside of the SolrCloud deployment, ensure that the SolrCloud hosts are reachable and provide the --zk and --solr command-line options.
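One way to avoid repeating both options on every call from an outside host is a thin wrapper. This is a sketch only: remote_solrctl is a hypothetical name, and the host names reuse the sample values shown for --zk and --solr in this document, so substitute your own.

```shell
# Sketch: run solrctl from a host outside the SolrCloud deployment.
# The values below are sample host names; replace them with your own.
ZK_ENSEMBLE='host1.cluster.com:2181,host2.cluster.com:2181/solr'
SOLR_URI='http://host1.cluster.com:8983/solr'

# remote_solrctl is a hypothetical wrapper name, not part of solrctl.
remote_solrctl() {
  solrctl --zk "$ZK_ENSEMBLE" --solr "$SOLR_URI" "$@"
}
```

With the wrapper defined, remote_solrctl collection --list runs the list command against the remote deployment.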
The solrctl commands init, instancedir, collection, and cluster affect the entire SolrCloud deployment and are executed only once per required operation.
The solrctl core command affects a single SolrCloud host.
If you are using solrctl to manage your deployment in an environment that requires Kerberos authentication, you must have a valid Kerberos ticket, which you can get using kinit.
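A script can check for a ticket before calling solrctl. The sketch below assumes MIT Kerberos, where klist -s exits with a non-zero status when the credential cache holds no valid ticket; with_ticket is a hypothetical helper name and the principal you pass is your own.

```shell
# Sketch: make sure a Kerberos ticket exists before calling solrctl.
# Assumes MIT Kerberos: `klist -s` is silent and fails when no valid ticket exists.
# with_ticket is a hypothetical helper name.
with_ticket() {
  principal="$1"; shift
  # Request a ticket only when the cache has none; kinit may prompt for a password.
  klist -s || kinit "$principal" || return 1
  solrctl "$@"
}
```

For example, with_ticket solradmin@EXAMPLE.COM collection --list, where the principal is a placeholder.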
Using solrctl with an HTTP proxy
Using solrctl to manage a deployment fails in an environment that uses an HTTP proxy (http_proxy), because solrctl uses curl, which attempts to send requests through the web proxy. You can disable the proxy for solrctl in either of the following ways:
- You can modify the settings for the current shell by exporting the NO_PROXY environment variable. For example:
$ export NO_PROXY='*'
- You can modify the settings for a single command by prefacing the solrctl command with NO_PROXY='*'. For example:
$ NO_PROXY='*' solrctl collection --create yourCollectionName -s 1
You can initialize the state of the entire SolrCloud deployment and each individual host within the SolrCloud deployment using solrctl. The general solrctl command syntax is of the form:
solrctl [options] command [command-arg] [command [command-arg]] ...
Each of these elements and its possible values is described in the following sections.
- --solr solr_uri: Directs solrctl to a SolrCloud web API available at a given URI. This option is required for hosts running outside of SolrCloud. A sample URI might be: http://host1.cluster.com:8983/solr.
- --zk zk_ensemble: Directs solrctl to a particular ZooKeeper coordination service ensemble. This option is required for hosts running outside of SolrCloud. For example: host1.cluster.com:2181,host2.cluster.com:2181/solr.
- --help: Prints help.
- --quiet: Suppresses most solrctl messages.
- init [--force]: The init command, which initializes the overall state of the SolrCloud deployment, must be executed before starting solr-server daemons for the first time. Use this command cautiously as it is a destructive command that erases all SolrCloud deployment state information. After a successful initialization, it is impossible to recover any previous state.
- instancedir [--generate path [-schemaless]] [--create name path] [--update name path] [--get name path] [--delete name] [--list]: Manipulates instance directories. The following options are supported:
- --generate path: Generates a template of the instance directory. The template is stored at the given path in the local filesystem and has the configuration files under /conf. See Solr's README.txt for the complete layout.
- -schemaless: Generates a schemaless template of the instance directory. For more information on schemaless support, see Using Schemaless Mode (CDH 5.1 or later only).
- --create name path: Pushes a copy of the instance directory from the local filesystem to SolrCloud. If an instance directory with this name is already known to SolrCloud, this command fails. To change an instance directory that already exists, use --update.
- --update name path: Updates SolrCloud's existing copy of an instance directory based on the files present in the local filesystem. This can be thought of as first using --delete name followed by --create name path.
- --get name path: Downloads the named instance directory to the given path in the local filesystem. Once downloaded, the files can be edited.
- --delete name: Deletes the instance directory name from SolrCloud.
- --list: Prints a list of all available instance directories known to SolrCloud.
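The instancedir options are often combined to change the configuration of a running collection: download the current copy, edit it, push it back, and reload. The following is a sketch under assumptions: update_config is a hypothetical helper name, the collection is assumed to have the same name as its instance directory, and conf/schema.xml is only an example of a file you might edit (a schemaless template may not use it).

```shell
# Sketch: round-trip an instance directory and reload the collection that uses it.
# update_config is a hypothetical helper name.
update_config() {
  name="$1"      # instance directory name (assumed to match the collection name)
  workdir="$2"   # local path to download the files to

  solrctl instancedir --get "$name" "$workdir" || return 1
  # Edit a configuration file; schema.xml is an assumed example.
  "${EDITOR:-vi}" "$workdir/conf/schema.xml" || return 1
  solrctl instancedir --update "$name" "$workdir" &&
    solrctl collection --reload "$name"
}
```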
- collection [--create name -s <numShards> [-a] [-c <collection.configName>] [-r <replicationFactor>] [-m <maxShardsPerHost>] [-n <createHostSet>]] [--delete name] [--reload name] [--stat name] [--list] [--deletedocs name]: Manipulates collections. The following options are supported:
- --create name -s <numShards> [-a] [-c <collection.configName>] [-r <replicationFactor>] [-m <maxShardsPerHost>] [-n <createHostSet>]: Creates a new collection.
New collections are given the specified name and are split into <numShards> shards.
The -a option configures auto-addition of replicas if machines hosting existing shards become unavailable.
SolrCloud hosts are configured using the <collection.configName> instance directory. Replication is configured by a factor of <replicationFactor>. The maximum number of shards per host is determined by <maxShardsPerHost>, and the collection is allocated to the hosts specified in <createHostSet>.
The only required parameters are name and numShards. If collection.configName is not given, it is assumed to be the same as the name of the collection.
- --delete name: Deletes a collection.
- --reload name: Reloads a collection.
- --stat name: Outputs SolrCloud specific run-time information for a collection.
- --list: Lists all collections registered in SolrCloud.
- --deletedocs name: Purges all indexed documents from a collection.
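As an illustration of how the --create parameters fit together, the sketch below creates a collection with an explicit shard count, replication factor, and instance directory, and then prints its run-time information. The helper name create_collection and the example values are hypothetical.

```shell
# Sketch: create a collection with explicit sharding and replication settings.
# create_collection is a hypothetical helper name.
create_collection() {
  name="$1"
  shards="$2"          # -s: number of shards (required)
  replicas="$3"        # -r: replication factor
  config="${4:-$1}"    # -c: instance directory, defaulting to the collection name

  solrctl collection --create "$name" -s "$shards" -r "$replicas" -c "$config" &&
    solrctl collection --stat "$name"
}
```

For example, create_collection logs 2 2 shared_config would create a collection named logs with two shards and a replication factor of two, configured from an instance directory named shared_config.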
- core [--create name [-p name=value]...] [--reload name] [--unload name] [--status name]: Manipulates cores. This command operates on a particular SolrCloud host. Use this expert command with caution. The following options are supported:
- --create name [-p name=value]...: Creates a new core on a given SolrCloud host. The core is configured using name=value pairs. For more details on configuration options, see the Solr documentation.
- --reload name: Reloads a core.
- --unload name: Unloads a core.
- --status name: Prints status of a core.
- cluster [--get-solrxml file] [--put-solrxml file]: Manages cluster configuration. The following options are supported:
- --get-solrxml file: Downloads the cluster configuration file solr.xml from ZooKeeper to the local system.
- --put-solrxml file: Uploads the specified file to ZooKeeper as the cluster configuration file solr.xml.
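The two cluster options can be combined to edit solr.xml safely. The following is a minimal sketch: edit_solrxml is a hypothetical helper name, and the .bak backup copy is a convention of this example rather than something solrctl creates.

```shell
# Sketch: download solr.xml, keep a local backup, edit it, and upload it again.
# edit_solrxml is a hypothetical helper name.
edit_solrxml() {
  file="${1:-./solr.xml}"

  solrctl cluster --get-solrxml "$file" || return 1
  cp "$file" "$file.bak" || return 1    # local backup of the downloaded copy
  "${EDITOR:-vi}" "$file" || return 1
  solrctl cluster --put-solrxml "$file"
}
```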