Installing Apache Solr
This page describes how to install and start Apache Solr and create a new Solr core based on a Magnolia configuration set. Apache Solr is a standalone enterprise-grade search platform that’s needed together with the Magnolia Solr module for high-performance searches of large volumes of documents.
The installation procedure for Apache Solr described on this page has been reduced to the minimum steps required to set up, start and use Solr with a Magnolia instance.
For a full account, please refer to the official Solr documentation at solr.apache.org/guide/.
Solr module compatibility with Apache Solr
Module version | Solr version |
---|---|
|
Solr |
|
Solr |
|
Solr |
|
Solr |
Getting Apache Solr
Download Apache Solr and extract the zip to your computer.
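The extract-and-start step can be sketched in shell as follows. This is a minimal sketch, not an official install script. It assumes the downloaded archive is named solr-8.11.1.zip (the Solr version named in the configuration example on this page) and that unzip is available. The helper name extract_and_start_solr is hypothetical; bin/solr start and bin/solr status are commands of the standard Solr control script.

```shell
# Assumption: the archive name follows the pattern solr-<version>.zip.
SOLR_VERSION="${SOLR_VERSION:-8.11.1}"
SOLR_ARCHIVE="solr-${SOLR_VERSION}.zip"

# Hypothetical helper: extract the archive into the current directory,
# start Solr on its default port and print the status of the running node.
extract_and_start_solr() {
  solr_root="$PWD/solr-${SOLR_VERSION}"
  unzip -q "$SOLR_ARCHIVE" || return 1
  "$solr_root/bin/solr" start || return 1
  "$solr_root/bin/solr" status
}

# Usage, from the directory that contains the archive:
#   extract_and_start_solr
```

The extracted directory is the location this page appears to refer to as $SOLR_HOME. The sketch deliberately does not export a variable of that name, because the Solr control script gives an exported SOLR_HOME its own meaning.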
Version-specific installation notes
Version 5.2
This version contains changes in the Solr configuration files. Please read the notes below before updating.
Changes in Solr 5.x configuration files
managed-schema
- New fields:
  - jcrname (<field name="jcrname" type="string" stored="true" indexed="true"/>).
  - nodetype (<field name="nodetype" type="string" stored="true" indexed="true"/>).
  - (dynamic) asset_* (<dynamicField name="asset_*" type="text_general" indexed="true" stored="true" multiValued="false"/>).
- Removed:
  - <copyField source="*" dest="text"/> (replaced by CloneFieldUpdateProcessorFactory in solrconfig.xml, see also below).
  - Dynamic field *_point (in collision with the *_point fields when parsed by Apache Tika in documents).
  - Dynamic field *_id (Indonesian) (in collision with the *_id fields when parsed by Apache Tika in documents).
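When updating an existing managed-schema by hand, a quick check against the list above can help. The following is a rough sketch only: the function name check_magnolia_schema is hypothetical, and it matches literal strings, so it will not recognise attributes written in a different order and it will also match elements that are commented out.

```shell
# Hypothetical helper: report differences between a managed-schema file
# and the field changes listed in the 5.2 notes. Literal matching only.
check_magnolia_schema() {
  schema_file="$1"
  result=0
  # Fields that the 5.2 notes list as new.
  for required in 'name="jcrname"' 'name="nodetype"' 'name="asset_*"'; do
    if ! grep -qF -- "$required" "$schema_file"; then
      echo "missing: $required"
      result=1
    fi
  done
  # Definitions that the 5.2 notes list as removed.
  for removed in '<copyField source="*" dest="text"/>' 'name="*_point"' 'name="*_id"'; do
    if grep -qF -- "$removed" "$schema_file"; then
      echo "still present: $removed"
      result=1
    fi
  done
  return "$result"
}

# Usage (the path is an assumption about where the core's schema lives):
#   check_magnolia_schema "$SOLR_HOME/server/solr/magnolia/conf/managed-schema"
```

The function prints one line per mismatch and returns a non-zero status if any were found, so it can also be used in a script condition.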
solrconfig.xml
- Changes in ExtractingRequestHandler:
  - By default only document content is indexed (into the asset_content field).
  - All fields that are not defined in the schema are ignored (<str name="uprefix">ignored_</str>).
- The CloneFieldUpdateProcessorFactory was added into the add-unknown-fields-to-the-schema update request processor chain. This replaces <copyField source="*" dest="text"/>, which was removed from the managed-schema.
- The uuid, version, id, path, workspace, nodetype, assetproviderid, url, type and all ignored fields are by default excluded from copying to the text field.
Version 5.0.2
This version contains changes in solrconfig.xml and managed-schema. Please read the notes before updating to 5.0.2.
Updating to 5.0.2
Option 1
If you don’t plan to index the same content with two different indexers or crawlers, you don’t need to update the solrconfig.xml and managed-schema of your Solr core. The only change you need to make is to add the uniqueKeyField property with the value id to your Solr search result page.
Option 2
Use the managed-schema (https://git.magnolia-cms.com/projects/ENTERPRISE/repos/solr-search-provider/raw/magnolia-solr-search-provider/src/main/resources/solr-config-files/managed-schema?at=refs%2Ftags%2Fmagnolia-solr-search-provider-5.0.2) and solrconfig.xml configuration files for your Solr core and for $SOLR_HOME/server/solr/configsets/magnolia_data_driven_schema_configs.
Because of the changes in the configuration files, all Solr indexes must be recreated. The easiest way to do this is probably to recreate the Solr core and then retrigger indexing in Magnolia.
- Use the new solrconfig.xml and managed-schema configuration files for the $SOLR_HOME/server/solr/configsets/magnolia_data_driven_schema_configs Magnolia config set.
- Delete the magnolia core and create it again:
  cd $SOLR_HOME/bin
  ./solr delete -c magnolia
  ./solr create_core -c magnolia -d magnolia_data_driven_schema_configs
- Retrigger the indexers by changing their indexed property to false.
Creating a configuration set
Create a new Magnolia configuration set by duplicating the $SOLR_HOME/server/solr/configsets/_default folder and naming the copy magnolia_data_driven_schema_configs.
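The duplication step can be sketched in shell as follows. This is a minimal sketch: the function name create_magnolia_configset is hypothetical, and its argument is assumed to be the Solr installation directory that this page calls $SOLR_HOME. It refuses to overwrite an existing configuration set.

```shell
# Hypothetical helper: copy the _default config set to a new
# magnolia_data_driven_schema_configs config set.
# $1 is the Solr installation directory.
create_magnolia_configset() {
  configsets_dir="$1/server/solr/configsets"
  target="$configsets_dir/magnolia_data_driven_schema_configs"
  if [ -e "$target" ]; then
    echo "already exists: $target" >&2
    return 1
  fi
  cp -R "$configsets_dir/_default" "$target"
}

# Usage:
#   create_magnolia_configset "$SOLR_HOME"
```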
In this new configuration set, you need to create or modify two files, solrconfig.xml and managed-schema:
- solrconfig.xml, the configuration file with most of the parameters affecting Solr itself.
- managed-schema, a file that specifies what fields the Magnolia content can contain, how those fields are added to the index, and how they are queried.
For further details see https://solr.apache.org/guide/8_11/documents-fields-and-schema-design.html.
Configuration example
Please be aware that different Solr versions may require different content in the Solr configuration files. The example configuration files attached below are for Solr 8.11.1 and have been tested against version 6.1 of the Magnolia Solr module.
Download the following example configuration files (based on the Solr data_driven_schema_configs config set) and use them to replace the default files in the newly created magnolia_data_driven_schema_configs/conf folder:
Creating a new core
A core is a running instance of a Lucene index along with all the Solr configuration required to use it. Create a new core called magnolia:
./solr create_core -c magnolia -d magnolia_data_driven_schema_configs
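To confirm from the command line that the core exists, Solr's CoreAdmin API can be queried for its status. The sketch below assumes Solr is running locally on its default port. The SOLR_URL variable and the core_status_url helper are hypothetical names introduced for illustration; only the admin/cores STATUS action belongs to Solr itself.

```shell
# Assumption: Solr is reachable at its default local address.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Hypothetical helper: build the CoreAdmin STATUS URL for a core name.
core_status_url() {
  printf '%s/admin/cores?action=STATUS&core=%s\n' "$SOLR_URL" "$1"
}

# Usage (requires a running Solr and curl):
#   curl -fsS "$(core_status_url magnolia)"
```

The response describes the named core, and an entry without details suggests that the core was not created.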
Use the admin dashboard
Open http://localhost:8983/solr/ to use Solr’s admin dashboard. From there you can also create cores.
Please note that the type of installation described above is suitable for testing and development purposes. For a production installation, see Taking Solr to Production (https://solr.apache.org/guide/8_6/taking-solr-to-production.html, Solr 8.6 link).