Some kind of search functionality is considered a basic feature on almost every website. The larger and more complex a website is, the more important it is that visitors can search it. With an easy-to-handle extension, TYPO3 can be connected to the search server Apache Solr. Read more about the advantages and how to run your first search …
Introduction to search solutions
The larger and more complex a website is, the more important a prominent search field and decent search functionality behind it become. For TYPO3, there are several search solutions around. Each one has its own advantages and disadvantages.
Overview of search solutions for TYPO3
Google Custom Search
Google provided a search component that could be embedded in a website and returned only results belonging to a specific domain. The end of life (EOL) of this product is April 1st, 2018. The same is true for Google Enterprise Search and its appliances.
Indexed Search and Crawler
The system extension “Indexed Search” has been there since the beginning of TYPO3. It is still delivered with the TYPO3 Core, but it basically only allows the indexing of TYPO3 pages. This is where the extension “crawler” stepped in and made it possible to index tables other than the TYPO3 core ones. But to be honest, the combination of “Indexed Search” and “Crawler” and I never became friends.
The approach of the TYPO3 extension “ke_search” is quite similar to EXT:indexed_search: all data is kept in a TYPO3 database table and the queries are run against it.
“ke_search” makes it quite straightforward to write your own indexers for custom tables. But it only fills two columns, “title” and “content”, against which the MySQL full-text search is executed. Thus relevance scoring is done by MySQL and cannot be influenced by you. The scoring is a black box.
At the TYPO3 University Days 2017, the extension “mk_search” was presented as an all-in-one frontend to “all” search backends. Unfortunately, the documentation is only available in German, and many links to chapters do not work.
Just a couple of days ago, a new extension called “search_core” was published. It is in an early beta state and connects TYPO3 to Elasticsearch.
Last but not least, there is the TYPO3 extension “solr”. It solves many (all?) of the issues that the other search solutions in this list have:
- completely open source
- very flexible indexing
- scoring can be defined
- long development history with a plan for the future
- integration with TYPO3 is quite easy
The remaining part of this post focuses on Apache Solr and its integration with TYPO3. There will be two more posts: one about pushing content to Apache Solr and one about influencing the scoring of the search results.
But now let’s go on with the basics of Apache Solr and its use within TYPO3.
Apache Solr is a service, like the Apache HTTPD web server or the mail server Postfix. Instead of returning HTML pages or forwarding emails, its aim is to provide solid and valuable search results to users.
One requirement is to have an instance of an Apache Solr server running. There are at least three possibilities:
Self-hosted Apache Solr
The most flexible way is self-hosting Apache Solr. Possible installation methods are:
– native on the server
– “virtualizing” with a Docker container or
– (on macOS) using Homebrew
Self-hosted Solr servers must use the Solr configuration that comes with the extension in the folder “EXT:solr/Resources/Private/solr”. Depending on the Solr version available, a different configuration set must be used.
A complete version matrix is available in the docs.
Choosing the right web hosting provider
Some web hosts, such as jweiland.net, offer hosting packages that include Apache Solr. Starting with the premium plan, two Apache Solr cores are included. More can be booked for a reasonable price.
Using a specialized provider
If neither variant fits, you can book Apache Solr as a hosted service. With Hosted Solr, the Frankfurt-based company dkd offers such a simple-to-use service. They also offer a free trial period of 30 days. This should be enough to get started with Apache Solr.
But enough with the ads. ;-) Let’s have a look at the Solr extension itself.
EXT:solr for TYPO3 – A Kickstart Guide
In the next couple of lines, I will walk you through a kind of kickstart guide. You will learn how to take your first steps with TYPO3 and Apache Solr successfully.
Step 1: Install the extension
The first step is (of course) the installation of the extension “solr”. It is available via the Extension Manager (EM) or via Composer.
Step 2: Configure it
A basic configuration needs just the following five steps. You might have already done some of them in your existing installation.
Set a sys_domain record
First, you must define a sys_domain record for the domain you want to index. This tells EXT:solr which domain should be used while indexing and searching.
“Use as site root”
The page where the sys_domain record resides must also have the option “Use as site root” set. As a result, the globe icon is displayed in the page tree, right before the page title.
Add basic TypoScript settings
On the TypoScript level, you must include the static TypoScript template of the extension in your template record in the module “Template”. Furthermore, you provide the connection information for your Solr instance in the TypoScript constants. The following values should be enough:
scheme = https
host = localhost
port = 8983
path = /solr/core_en/
Please replace the dummy values with your actual ones ;-)
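To see how these four values fit together, here is a minimal Python sketch that joins them into the base URL of a Solr core. It is only an illustration, not how EXT:solr assembles its connection internally; the function names `parse_constants` and `solr_core_url` are made up for this example.

```python
from urllib.parse import urlunsplit

# The four dummy values from above, as plain "key = value" lines.
CONSTANTS = """
scheme = https
host = localhost
port = 8983
path = /solr/core_en/
"""


def parse_constants(text: str) -> dict[str, str]:
    """Parse simple 'key = value' lines into a dictionary (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def solr_core_url(values: dict[str, str]) -> str:
    """Join scheme, host, port and path into the base URL of a Solr core."""
    # Normalise the path so it starts and ends with exactly one slash.
    path = "/" + values["path"].strip("/") + "/"
    netloc = f'{values["host"]}:{values["port"]}'
    return urlunsplit((values["scheme"], netloc, path, "", ""))


core_url = solr_core_url(parse_constants(CONSTANTS))
print(core_url)
```

Opening the resulting URL in a browser or with a command line HTTP client is a quick way to verify that host, port and core name are correct before digging into TYPO3 itself.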
Initialize Solr connections
After the installation, a new entry is available in the clear cache drop-down: “Initialize Solr connections”. Clicking this entry checks the availability of the Solr server.
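If you want to run a similar availability check by hand, the following Python sketch may help. It assumes that the core exposes Solr's ping handler under `admin/ping` and that the JSON response carries a `status` field with the value `OK`. The function names are hypothetical, and this is not necessarily what the extension does behind the button.

```python
import json
import urllib.request
from urllib.error import URLError


def ping_reports_ok(response_body: str) -> bool:
    """Return True if a Solr ping response (JSON) reports the status OK."""
    try:
        return json.loads(response_body).get("status") == "OK"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all, or JSON that is not an object.
        return False


def solr_is_available(core_url: str, timeout: float = 5.0) -> bool:
    """Call the ping handler of a core; any network error counts as unavailable."""
    # Assumption: the ping handler is enabled at admin/ping for this core.
    ping_url = core_url.rstrip("/") + "/admin/ping?wt=json"
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as response:
            return ping_reports_ok(response.read().decode("utf-8"))
    except (URLError, OSError, ValueError):
        return False


# Hypothetical response body, shaped like the JSON a healthy core returns.
print(ping_reports_ok('{"responseHeader": {"status": 0}, "status": "OK"}'))
```

Calling `solr_is_available("https://localhost:8983/solr/core_en/")` with your own core URL returns `False` for connection errors, timeouts and non-OK responses alike, so it only tells you that something is wrong, not what.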
Check in the info module
If everything went OK, the info module of the Solr extension shows everything in green. If you see messages other than those in the screenshot, you can check the section “Status Report” in the module “Reports”. There you will find a more detailed diagnosis of the possible causes.
Step 3: Start indexing
All green? If so, you can now start indexing your website. Go to the Apache Solr module “Index Queue”, activate the checkbox in front of “pages” and click the button “Queue Selected Content for Indexing”. Now it is time to click the button “Index now”. After a couple of seconds, you will have the first pages in the Solr index.
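To double-check from outside TYPO3 that documents really arrived in the index, you can ask Solr for the total number of documents. The Python sketch below builds a `select` request that matches everything (`*:*`) but returns no rows, and reads `numFound` from the JSON response. The function names and the sample response are made up for illustration.

```python
import json
from urllib.parse import urlencode


def count_query_url(core_url: str) -> str:
    """Build a select URL that matches all documents but returns none of them."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return core_url.rstrip("/") + "/select?" + params


def indexed_documents(response_body: str) -> int:
    """Read the total hit count from a Solr JSON response."""
    return json.loads(response_body)["response"]["numFound"]


# Hypothetical response body, shaped like Solr's JSON output.
sample = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 42, "start": 0, "docs": []}}'
)

print(count_query_url("https://localhost:8983/solr/core_en/"))
print(indexed_documents(sample))
```

If the number stays at zero after indexing, the pages most likely never reached the Solr server.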
Step 4: Insert the plugin
The next step is to insert the search plugin on a page. For the first tests you should definitely use “Search: Form, Result, Additional Components”, as this plugin provides all components needed for a search. You do not need to provide any further settings in the plugin.
Step 5: Start searching
Last but not least, you can start searching right away! Here is an example screenshot based on the official TYPO3 introduction package:
I hope this primer gave you a rough idea of how easy it is to run a search with TYPO3 and Apache Solr. Steps 2 to 5 from the section above gave a good overview of what the extension does:
– taking care of communication parameters
– pushing data to the Solr server
– getting data from the Solr server
– rendering of search functions in the frontend
– providing tools for index administration
The configuration and handling of the TYPO3 extension are very well documented in the extension manual. If you have already set up some extensions, this should be quite easy.
At least for me, the tricky part was understanding what happens to the data on its way to and from the index. This will be the topic of the next post in this series. The third post about Apache Solr and TYPO3 will be about understanding and influencing the score of the search results.
I want to thank my supporters, who make this blog post possible. For this blog post I welcome DKD Internet Services as a Platinum sponsor.