Forum Discussion

Warren_Brill
6 years ago

Custom Component to Implement Multiple Crawls

Per https://community.khoros.com/t5/Developer-Knowledge-Base/Implementing-federated-search-using-a-crawl-list/ta-p/7275, I can implement a crawl of a blog. The output is a list of the blog's article URLs.

The crawl URL format is https://[community URL]/[community ID]/crawl_boards?board.id=[board ID], and this works.

However, I want to crawl about 20 blogs in order to create a sort of site map covering only the blogs, for SEO purposes.

I have created a component that has the crawl URLs in a list, but it only crawls the first one and then stops.

I also created a component for each of the crawl URLs, and a component (loosely named after the blog) to call them (just for testing), like:

<@component id="alliancesblog" />
<@component id="storage-block" />
<@component id="labs" />

... but again, only the first one executes.

Is there a way to make more than one crawl execute?
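
A minimal sketch of one alternative, assuming the crawl pages are requested directly from a script outside the community rather than through components. It builds one crawl URL per board from the format above and fetches each one independently, so a finished crawl cannot stop the next one. The host `community.example.com`, the community ID `t5`, and the board IDs are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_crawl_urls(base_url, community_id, board_ids):
    """Return one crawl_boards URL per board ID, following the format
    https://[community URL]/[community ID]/crawl_boards?board.id=[board ID]."""
    base = base_url.rstrip("/")
    return [
        f"{base}/{community_id}/crawl_boards?{urlencode({'board.id': board_id})}"
        for board_id in board_ids
    ]


def _http_get(url):
    """Fetch a page and return its body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch=_http_get):
    """Fetch every URL in turn; a failed request is recorded as None
    and does not prevent the remaining crawls from running."""
    pages = {}
    for url in urls:
        try:
            pages[url] = fetch(url)
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            pages[url] = None
    return pages


if __name__ == "__main__":
    # Placeholder values; substitute the real community URL, ID, and board IDs.
    crawl_urls = build_crawl_urls(
        "https://community.example.com", "t5",
        ["alliancesblog", "storage-block", "labs"],
    )
    for url, body in fetch_all(crawl_urls).items():
        print(url, "failed" if body is None else f"{len(body)} characters")
```

The combined output would still need to be parsed for article links and published somewhere search engines can reach, which this sketch does not attempt.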

  • DougS
    Khoros Oracle

    If you are trying to add a sitemap to the community, you might want to consider using the community sitemap.xml generation functionality (if you haven't already). It is mentioned in the following knowledge base article:

    https://community.khoros.com/t5/Tactical/Common-questions-about-Lithium-SEO/ta-p/117447?collapse_discussion=true&filter=includeTkbs&include_tkbs=true&q=sitemap&search_type=thread

    Regarding what you are trying to do with the crawl page, I think I may need a little more information from you to get a better understanding of what you are trying to do. Are you trying to point a crawler of some kind at the community crawl pages? I'm not sure I understand what you need the custom components for. Don't you just want to hit the crawl pages directly?

    Thanks,

    -Doug

    • For SEO purposes, I need a simple way to populate a custom page that lists all the current blog articles. We do not score well on search engines, for reasons we have been unable to determine, so we are trying to jumpstart better results. I've tried building an API call to do this, and basically get no results. (I have little experience with programming via APIs, and all of the documentation on Lithosphere expects one to already be an expert.) However, I can do a crawl of a blog and get the articles. It's just that when the crawl ends, even though the component calls for another crawl, nothing happens. I tried multiple single-crawl components called by a higher-level component (think classic procedural programming), but the same thing happens.

      • DougS
        Khoros Oracle

        If you are calling into the community from another application to crawl the boards, you probably don't want to call a component on a page. Instead, call either an API (see the REST API V2 getting started page, and the section of that page on how to make a REST call via Freemarker) or an Endpoint.

        If you want to share your component code though, I could possibly offer some advice on what you might want to try.

        Thanks,

        -Doug
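
A rough sketch of the API route from an external application, under several assumptions that should be checked against the REST API V2 documentation: that the search endpoint lives at `/api/2.0/search?q=`, that a LiQL query on the `messages` collection with `depth = 0` returns only the top-level blog articles, and that the JSON response carries its rows under `data.items`. The host and board ID are placeholders, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_liql_url(base_url, board_id, limit=25):
    """Build a search URL that asks for the top-level messages of one board.
    The endpoint path and field names are assumptions, not confirmed here."""
    query = (
        "SELECT id, subject, view_href FROM messages "
        f"WHERE board.id = '{board_id}' AND depth = 0 LIMIT {limit}"
    )
    return f"{base_url.rstrip('/')}/api/2.0/search?q={quote(query)}"


def extract_articles(payload):
    """Return (subject, view_href) pairs from a decoded JSON response,
    assuming the rows are stored under payload['data']['items']."""
    items = payload.get("data", {}).get("items", [])
    return [(item.get("subject"), item.get("view_href")) for item in items]


def list_articles(base_url, board_ids):
    """Query each board separately and collect the articles per board."""
    results = {}
    for board_id in board_ids:
        with urlopen(build_liql_url(base_url, board_id), timeout=30) as resp:
            results[board_id] = extract_articles(json.load(resp))
    return results


if __name__ == "__main__":
    # Placeholder host and board IDs; private boards may also need authentication.
    for board, articles in list_articles(
        "https://community.example.com", ["alliancesblog", "labs"]
    ).items():
        print(board, len(articles))
```

Querying one board per request keeps each call simple, and the loop over about 20 boards runs to completion regardless of what any single board returns.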