Set up search with StreamX and AEM

In this tutorial, we will set up StreamX search with AEM as the data source. We will use the We.Retail sample application and content, which come pre-installed with non-production AEM author installations.

Prerequisites

To complete this tutorial, you will need:

  • Roughly 15 minutes

  • StreamX CLI installed

  • Git installed

  • jq installed

  • A running AEM 6.5 author instance with at least Service Pack 6.5.17 installed, including the out-of-the-box We.Retail sample application and content.

If your author instance wasn’t started with the nosamplecontent run mode, you can safely assume that We.Retail is installed. To validate the installation, visit the We.Retail landing page in the English master. If it returns a 404, you don’t have a proper We.Retail installation. In that case, use another (for example, a fresh) AEM author instance.
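The prerequisites above can also be checked from the terminal. The following is a minimal sketch and not part of the tutorial's resources. It assumes that the author instance listens on http://localhost:4502 with admin:admin credentials and that the English master landing page is served at /content/we-retail/language-masters/en.html, so adjust these values for your setup. The function names are hypothetical.

```shell
#!/bin/sh
# Print the name of every given tool that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Translate the HTTP status of the We.Retail landing page into a verdict.
weretail_verdict() {
  case "$1" in
    200) echo "We.Retail found" ;;
    404) echo "We.Retail missing" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Usage (host, credentials and page path are assumptions):
#   missing_tools streamx git jq
#   status=$(curl -s -o /dev/null -w '%{http_code}' -u admin:admin \
#     'http://localhost:4502/content/we-retail/language-masters/en.html')
#   weretail_verdict "$status"
```

An empty result from missing_tools means that all three command-line tools are available.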

Step 1: Get the source files

Clone the Git repository containing source files for the example:

git clone https://github.com/streamx-dev/streamx-docs-resources.git

Step 2: Install StreamX OSGi bundles and configuration

To integrate AEM with StreamX, you must install the StreamX OSGi bundle together with all the necessary OSGi dependencies and configuration. Once installed and configured, the bundle enables feeding the StreamX Mesh with AEM-sourced data. Follow the steps below to install the package:

  1. Visit AEM author - CRX Package Manager

  2. Upload and install aem-with-streamx-tutorials/streamx-aem.all-1.0.2.zip from the cloned project repository
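
If you prefer the terminal over the Package Manager UI, the upload can be scripted against the CRX Package Manager HTTP service. This is a sketch under assumptions: the author instance at http://localhost:4502 and the admin:admin credentials are defaults that may not match your setup, and install_package is a hypothetical helper name. Setting DRY_RUN=1 prints the curl command instead of running it.

```shell
#!/bin/sh
# Assumed defaults for a local author instance; override via the environment.
AEM_URL="${AEM_URL:-http://localhost:4502}"
AEM_CREDENTIALS="${AEM_CREDENTIALS:-admin:admin}"

# Upload and install a content package through the CRX Package Manager
# HTTP service. With DRY_RUN=1 the command is printed, not executed.
install_package() {
  pkg="$1"
  set -- curl -u "$AEM_CREDENTIALS" \
    -F "file=@$pkg" -F "name=$(basename "$pkg" .zip)" \
    -F force=true -F install=true \
    "$AEM_URL/crx/packmgr/service.jsp"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage, from the root of the cloned repository:
#   install_package aem-with-streamx-tutorials/streamx-aem.all-1.0.2.zip
```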

Step 3: Run the StreamX Mesh

  1. Open the terminal and go to search-with-streamx-and-aem-tutorial inside the cloned project directory.

  2. Run the StreamX Mesh by using the following command:

    streamx run

  3. Wait for the following output:

    -------------------------------------------------------------------
    STREAMX IS READY!
    -------------------------------------------------------------------
    ...
    -------------------------------------------------------------------
    Network ID:
    ...
    Mesh configuration file: ./mesh.yml
    -------------------------------------------------------------------
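
When the mesh is started from a script rather than watched in a terminal, readiness can be approximated by polling. The sketch below assumes that the search service endpoint used later in this tutorial (http://localhost:8082/search) starts answering HTTP requests once the mesh is up. That is an assumption, not documented StreamX behavior, and wait_until and http_answers are hypothetical helper names.

```shell
#!/bin/sh
# Run a probe command until it succeeds, up to a maximum number of attempts.
# Arguments: attempts, delay in seconds, then the probe command itself.
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Probe: succeeds as soon as anything answers HTTP on the given URL,
# regardless of the status code returned.
http_answers() {
  curl -s -o /dev/null "$1"
}

# Usage: wait up to about a minute for the search service port.
#   wait_until 30 2 http_answers 'http://localhost:8082/search' && echo "reachable"
```

The probe is passed as an argument so that a stricter check can be swapped in without changing the loop.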

Step 4: Publish content from AEM

At this point, we are ready to push content into the StreamX Service Mesh. With the StreamX Connector OSGi bundles and configuration installed, content ingestion is simply the familiar activation/publication/replication. On replication events, the StreamX Connector collects the content and sends it to the StreamX Service Mesh.

  1. Visit AEM author - Sites admin page - We.Retail United States

  2. Select page /content/we-retail/us and from the top menu, click on Manage Publication

  3. On the next screen (Options), leave the following defaults and proceed with Next:

    1. Action : Publish

    2. Scheduling : Now

  4. On the next screen (Scope) click on the thumbnail of the /content/we-retail/us item which will reveal the Include Children option

  5. Click on the Include Children item and deselect every checkbox, then confirm your changes by clicking on Add

  6. Finally, click on Publish

  7. Wait for AEM author to complete the publication

In the background, AEM sends the published content to the StreamX Service Mesh, which feeds data to the search service.

Step 5: Search content with StreamX

The search service, which is defined in and started with the StreamX Service Mesh, listens at http://localhost:8082/search.

  1. Search for Equipment and limit the results to 40 items by executing the following command in the terminal:

    curl -s 'http://localhost:8082/search/byQuery?size=40&query=equipment*' -H 'Accept: */*' --insecure | jq '.hits.hits | length'

    As a result, you should see the number 40, indicating that we received at least 40 hits across all We.Retail pages. The raw output of the search service is not easy to read, so we use jq to count the matches.
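
To try other search terms without retyping the command, the request can be wrapped in small helpers. This is a sketch: search_url and count_hits are hypothetical names, and the only parts taken from this tutorial are the byQuery endpoint, its size and query parameters, and the jq filter. The --insecure flag is left out because it only affects HTTPS connections.

```shell
#!/bin/sh
# Endpoint used in this tutorial; override via the environment if needed.
SEARCH_URL="${SEARCH_URL:-http://localhost:8082/search/byQuery}"

# Build the request URL for a search term and a result-size limit.
# The term is used as-is, so it must already be URL-safe (no spaces or '&').
search_url() {
  echo "$SEARCH_URL?size=$2&query=$1"
}

# Count the hits returned for a term; requires curl and jq.
count_hits() {
  curl -s "$(search_url "$1" "$2")" -H 'Accept: */*' | jq '.hits.hits | length'
}

# Usage:
#   count_hits 'equipment*' 40
```

Because size caps the number of returned hits, the count can never exceed the limit passed as the second argument.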

Step 6: Update content

Now let’s update AEM content and see how the change is reflected in the search service.

  1. First, verify that there are no hits for the search term StreamX: execute the following command and verify that its output is 0

    curl -s 'http://localhost:8082/search/byQuery?size=40&query=streamx*' -H 'Accept: */*' --insecure | jq '.hits.hits | length'

  2. Visit the AEM author - US EN Equipment Blueprint page

  3. Open the edit dialog of the Title component (under the Hero Image) that says Welcome our finest equipment, and replace the text with Hello StreamX from AEM!

  4. Confirm your changes on the edit dialog

  5. Use the Rollout Page option from the page action bar, and roll out to the two existing live copies

  6. Visit AEM author - US EN Equipment page

  7. Use the Publish Page option from the page action bar

  8. Finally, execute the search again:

    curl -s 'http://localhost:8082/search/byQuery?size=40&query=streamx*' -H 'Accept: */*' --insecure | jq '.hits.hits | length'

You should see the number 1, meaning that a single entry, our just-updated content, was found as a match by StreamX.
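
The Publish Page action from the steps above can also be triggered from the terminal through AEM's replication servlet. This is a sketch under assumptions: the author instance at http://localhost:4502, the admin:admin credentials, and the page path /content/we-retail/us/en/equipment are guesses about a default local setup, and publish_page is a hypothetical helper name. It activates a single page only, and the rollout step still has to be done in the UI.

```shell
#!/bin/sh
# Assumed defaults for a local author instance; override via the environment.
AEM_URL="${AEM_URL:-http://localhost:4502}"
AEM_CREDENTIALS="${AEM_CREDENTIALS:-admin:admin}"

# Request activation (publication) of a single page via the replication
# servlet. With DRY_RUN=1 the command is printed, not executed.
publish_page() {
  set -- curl -u "$AEM_CREDENTIALS" -X POST \
    -F "path=$1" -F cmd=activate \
    "$AEM_URL/bin/replicate.json"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (the page path is an assumption):
#   publish_page /content/we-retail/us/en/equipment
```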

Summary

Congratulations! You have set up StreamX search with AEM as the data source.