Set up sitemap generation with StreamX and AEM
AEM provides built-in support for generating sitemaps, which makes the task straightforward when the platform has full control over the website’s structure. However, in environments that combine multiple sources, markets, and projects, things get complicated fast. The StreamX event-streaming service mesh is designed to reduce integration complexity. Its sitemap generation feature focuses on streamlining integration and improving search engine indexing accuracy.
In this tutorial, we will set up StreamX sitemap generation with content originating from AEM.
Prerequisites
To complete this guide, you will need:
- Roughly 15 minutes
- A running AEM author 6.5 instance with at least Service Pack 6.5.17 installed, along with the out-of-the-box We.Retail sample application and content.
  If your author instance wasn’t started with the nosamplecontent run mode, you can safely assume that We.Retail is installed.
  To validate the installation, visit the We.Retail landing page in the English master.
  If it returns a 404, you don’t have a proper We.Retail installation.
  In that case, use another AEM author instance (for example, a fresh one).
Ensure that no other StreamX instance or any other application is occupying port 8081.
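The port check can also be scripted. The following is a minimal sketch, assuming the mesh will serve on localhost and that a successful TCP connection means the port is taken; the helper name `port_in_use` is hypothetical and not part of StreamX or AEM.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8081 is the port named in the prerequisites above
    if port_in_use(8081):
        print("Port 8081 is already in use; stop the other application first.")
    else:
        print("Port 8081 is free.")
```

If the script reports that the port is in use, stop the application that holds it before starting the mesh.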
Step 1: Get the source files
Clone the Git repository containing source files for the example:
git clone https://github.com/streamx-dev/streamx-docs-resources.git
Step 2: Install StreamX OSGi bundles and configuration
To integrate AEM with StreamX, you must install the StreamX OSGi bundle together with all the necessary OSGi dependencies and configuration. Once installed and configured, the bundle enables feeding the StreamX Mesh with AEM-sourced data. Follow the steps below to install the package:
- Upload and install aem-with-streamx-tutorials/streamx-aem.all-1.0.2.zip from the cloned project repository.
Step 3: Run the StreamX Mesh
- Open the terminal and go to generate-sitemap-aem-tutorial inside the cloned project directory.
- Run the StreamX Mesh by using the following command:
  streamx run
- Wait for the following output:
  -------------------------------------------------------------------
  STREAMX IS READY!
  -------------------------------------------------------------------
  ...
  -------------------------------------------------------------------
  Network ID: ...
  Mesh configuration file: ./mesh.yaml
  -------------------------------------------------------------------
Step 4: Publish content from AEM
- Visit http://localhost:8081/sitemap.xml and confirm that the resource is not available. That’s because we haven’t ingested any content so far.
- Visit AEM author - Sites admin page - We.Retail United States.
- Select the /content/we-retail/us page and click Manage Publication in the top menu.
- On the next screen (Options), leave the defaults and proceed with Next:
  - Action: Publish
  - Scheduling: Now
- On the next screen (Scope), click the thumbnail of the /content/we-retail/us item, which reveals the Include Children option.
- Click the Include Children item and uncheck each checkbox, then confirm your changes by clicking Add.
- Finally, click Publish.
- Wait for AEM author to complete the publication.
After the publication is done, visit http://localhost:8081/sitemap.xml again. The sitemap should now contain all the pages we’ve just published:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://localhost:8081/published/we-retail/us/en.html</loc>
  </url>
  <url>
    <loc>http://localhost:8081/published/we-retail/us/en/about-us.html</loc>
  </url>
  ...
  <url>
    <loc>http://localhost:8081/published/we-retail/us/en/women.html</loc>
  </url>
  <url>
    <loc>http://localhost:8081/published/we-retail/us/es.html</loc>
  </url>
</urlset>
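To check the result without reading the XML by eye, the published URLs can be extracted programmatically. The sketch below is illustrative only: it assumes the sitemap uses the standard sitemaps.org namespace shown above, and the helper names `sitemap_locations` and `fetch_sitemap` as well as the shortened `SAMPLE` document are hypothetical. The demo at the bottom parses the inline sample, so it runs without a live mesh.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace used by the <urlset> element in the sitemap above
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_text: str) -> list[str]:
    """Return the text of every <loc> inside a <url> entry, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def fetch_sitemap(url: str = "http://localhost:8081/sitemap.xml") -> str:
    """Download the sitemap; raises urllib.error.HTTPError on a 404."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


# Shortened sample (assumption: same shape as the tutorial output)
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://localhost:8081/published/we-retail/us/en.html</loc></url>
  <url><loc>http://localhost:8081/published/we-retail/us/es.html</loc></url>
</urlset>"""

for location in sitemap_locations(SAMPLE):
    print(location)
```

With the mesh running, `sitemap_locations(fetch_sitemap())` would return the list of published pages, which can then be compared against the pages selected in Manage Publication.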