Forum Discussion

SheenaK
Director
2 years ago

Pinging Sitemaps using GET requests and/or IndexNow

Posting here before contacting support to see if anyone else has achieved this.

 

We're all familiar with the sitemap, which is updated automatically by Khoros on a daily basis to reflect newly added (or deleted) threads.
 
However, Google doesn't check a sitemap every time a site is crawled; a sitemap is checked only the first time that they notice it, and thereafter only when the site pings them to let them know that it's changed.
 
Due to the size of the Community, waiting for Google to crawl and discover new URLs to index could take longer than is necessary. A more efficient way to speed up the indexing of new content is to notify Google by pinging, which was recently highlighted by Google’s Senior Search Analyst on Twitter.
 

 

There are 2 solutions for this:

1. Sending a GET request in a browser or command line

 

https://www.google.com/ping?sitemap=FULL_URL_OF_SITEMAP

 

 

2. Using IndexNow. This is also a simple ping so that search engines know that a URL and its content have been added, updated, or deleted, allowing search engines to quickly reflect this change in their search results.
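
As a rough sketch of what the two options look like in practice, the Python below builds the GET URL from option 1 and a JSON body for an IndexNow submission. It is not a Khoros feature: the function names are hypothetical, and `https://api.indexnow.org/indexnow` and the `host` / `key` / `keyLocation` / `urlList` fields reflect my reading of the public IndexNow protocol, so check them against the current documentation. Google has also announced that it is retiring the sitemap ping endpoint, so confirm it still responds before building anything around it.

```python
import json
import urllib.request
from urllib.parse import quote, urlsplit

# Endpoint quoted in the post above (option 1).
GOOGLE_PING_ENDPOINT = "https://www.google.com/ping"
# Shared IndexNow endpoint (assumption: taken from the public IndexNow docs).
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_sitemap_ping_url(sitemap_url):
    """Return the option 1 GET URL with the sitemap URL percent-encoded."""
    return f"{GOOGLE_PING_ENDPOINT}?sitemap={quote(sitemap_url, safe='')}"


def build_indexnow_payload(urls, key, key_location=None):
    """Build the JSON body for an IndexNow submission (option 2).

    `key` is the IndexNow key you generated and host as a text file on
    your own domain; all URLs must belong to the same host.
    """
    urls = list(urls)
    payload = {
        "host": urlsplit(urls[0]).netloc,
        "key": key,
        "urlList": urls,
    }
    if key_location:
        payload["keyLocation"] = key_location
    return payload


def submit_indexnow(payload, endpoint=INDEXNOW_ENDPOINT):
    """POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

For example, `build_sitemap_ping_url("https://example.com/sitemap.xml")` gives a URL you can paste into a browser or pass to `curl`, while `submit_indexnow(build_indexnow_payload([...], key))` would be called from whatever job notices new or changed threads.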

 

Has anyone used or set up either of these options on their Khoros community?

  • Thanks for replying, cblown. We use GSC extensively, and the tweet I added above is from Google’s Senior Search Analyst / Search Relations team lead, which indicates that simply having a sitemap available is not enough. Pinging Google whenever the sitemap is updated is preferred.

    From talking to other community managers, it does seem like SEO has been declining for communities in recent years, so it would be great to hear from anyone who has managed to combat this effectively.

  • IndexNow works with other search engines such as Bing, and results are good (an increase in Bing traffic). At the moment, Google accepts direct URL submissions only through its own Indexing API, and only for two content types: job postings and broadcast events. My estimate is that, because of the reasons mentioned (the internet becoming too vast), Google will start fully using IndexNow at some point in the (near) future.

    In the meantime, XML sitemaps are a good way to understand what has been indexed. However, this relies on XML sitemaps that contain all URLs on the site. It seems that the default sitemaps don't contain all URLs on Khoros (at least OOTB).

    You can normally submit RSS feeds to Google Search Console the same way you submit XML sitemaps. This is another way to inform Google about new URLs as they are created. Normally these feeds are at www.[site].com/rss/Community.

    • JasonLax
      Helper

      Hi - The feed at www.[site].com/rss/Community is being processed successfully with 21 URLs. These 21 discovered pages form a list that refreshes as new content is published. So this is another way to submit fresh content to Google (and Bing) on an ongoing basis.


  • I recently saw a great YouTube video on Google Search. It explains how vast the internet has become recently, especially with the massive overpopulation of spammy content from naughty SEO activity.

    User generated Community content used to organically rank really well in Google. But I suspect that in the process of Google dealing with all this spammy content, they have indirectly impacted sites that generate new content daily. 

    We've found that the Google Search Console tools are a good way to check up on any issues with crawling. It's best to follow Google's own guides on this.

     

  • Thanks for this, SheenaK.

    If you have any newer info I'm interested.

    I've forwarded this to my internal SEO SME to find out whether we do this: if so, how, and if not, whether we should.

     

    • Lief
      Champion

      SheenaK - I'm told that the original premise (that Google doesn't check the sitemap regularly) is incorrect. Their system checks it regularly for updates and changes.

      Indeed in our Google Search Console (attached image) several of our *node-level* sitemaps were recently checked and we don't implement any of the automatic pings.

      All that to say, you may not need to worry about building specific ping tech, unless perhaps your sitemaps are no longer being checked, or you want to try to get Google to check them more often.

      ^THAT^ said, I agree with you that the landscape for SEO and communities is difficult. We currently have some of our nodes that exceed the 50,000 record max for a sitemap and, currently, Khoros Support is saying we can't (they can't) split the sitemap. We are exploring other options for working with large content repos (like sitemap indexing) but still no joy. #WorkInProgress

      Happy to hear from others who consider their approaches to be successful.

       

       

       

  • I doubt this is in Khoros, but you could probably shim the realtime firehose with a small bit of custom code to push the notification to Google.

    We haven't seen slowness in Google picking up new things though and we've got a pretty big site with a looonng tail of content.

  • mdfw - do you have single locations in your Khoros community with more than 50,000 records in your sitemap? If so, I would love to hear if and how you are sharing all of your content with Google. So far, I can't get around the 50K limitation, and that means I'm leaving tens of thousands of records un-indexed (for SEO).

    Cheers.

    • luk
      Boss

      Wouldn't they have been indexed before they were "kicked" out of the back of the sitemap, i.e. while that content was still within the 50k limit?

      Regarding the general discussion: how do we know that Khoros is NOT pinging Google with the sitemaps when new content is submitted? I think it is doable via customization, but it's going to be ugly. It would need an endpoint and some mechanism to call it (maybe on the PostPage: when someone submits a post, hijack the onSubmit event via JS and call the endpoint, which then pings Google with the correct sitemap URL). So if this is actually needed, I definitely think it should be handled on the Java side of things, i.e. by the Khoros platform, not via customization.

      But yes, some clear yesses or noes from official reps would help 😉!

  • luk - It is possible that content *beyond* the 50K limit would have been indexed at one time if it is a slow build.
    In my case - we migrated >80K records; won't work for me.

    Other non-conforming (to that solution) use-cases could be very-high-volume sites, mass-content updates?

    I admit I don't know a ton about how Google decides to index/re-index, but I wouldn't be surprised if it finds record 50,000 one week and not the next (when it comes back to check the sitemap and that item has fallen off at position 50,100), or if there is some logic to deprecate/slash that content from their own index.

    For long-lived or high-volume communities I still think this isn't good - UNLESS I'm just not asking the right question and it is well and truly solved. Khoros Support indicates that split sitemaps are not supported, and Google's answer is "split the sitemap" - so I think we are still in limbo 😄
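
For anyone exploring the "split the sitemap" route outside the platform, here is a minimal Python sketch of the idea: chunk a URL list into files of at most 50,000 entries and list those files in a sitemap index. Nothing here is a Khoros API; the function names and the `sitemap-N.xml` file naming are hypothetical, and it assumes you can export the full URL list and host the resulting files yourself.

```python
from xml.sax.saxutils import escape

# Per-file URL limit discussed in this thread (sitemaps.org protocol).
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_sitemap(urls):
    """Render one <urlset> document for a single chunk of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def render_sitemap_index(sitemap_urls):
    """Render the <sitemapindex> document that points at each chunk file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


def build_split_sitemaps(urls, base_url):
    """Return (index_xml, {filename: sitemap_xml}) for the given URLs.

    `base_url` is where the chunk files would be hosted (hypothetical).
    """
    files = {
        f"sitemap-{n}.xml": render_sitemap(chunk)
        for n, chunk in enumerate(chunk_urls(urls), start=1)
    }
    index = render_sitemap_index(f"{base_url}/{name}" for name in files)
    return index, files
```

With a migration of just over 80,000 records like the one mentioned above, `build_split_sitemaps(urls, "https://example.com")` would produce two chunk files plus one index file, and only the index would need to be submitted in Search Console.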