Genius

Search Spam Bot?

I'm putting together a metrics report on off-season activity and just noticed that our community search was rather inundated with junk searches over the summer. We received thousands of searches for the terms "_change_me_" and "1". Of course many were unsuccessful because they were searching from specific boards, categories, etc.

 

Anyone else experience this or know who was behind it? More curiosity at this point than anything as I'll be manually adjusting the search results to remove the junk.

 

Thanks.

Jillian Bejtlich
Community Architect
The Community Roundtable
10 Replies
Leader

I also have "_change_me_" as the highest unsuccessful search term in my Community, by several orders of magnitude. 

 

Strange.



-Brien

Khoros Alumni (Retired)

Feeling left out since here on the Lithium Community we haven't seen such searches since June 2017 🙁


Normally, the explanation for many of these odd search terms is related to UX: e.g., community members are searching for other members using the content search, they are searching for a community setting using search, or the search scope is too narrow within a community sub-node to actually yield results.

 

But the fact that you are seeing this across different communities does indeed suggest some automated entity at work. This could be a script verifying that spam content actually made it to the community, or a content scraper that went rogue. The best way to deal with these anomalies is indeed to manually clean up the report.

 

Should they persist, you could try to identify and lock out these visitors. Whether that's worth the effort depends on whether their visits are just a nuisance or are causing a lot of traffic. To identify these users, you would use your web analytics tool to look at high traffic volumes from individual IPs.
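
As a rough illustration of the "high traffic volumes from individual IPs" check, here is a minimal Python sketch. It assumes you can export one IP address per search request from your analytics tool; the function name `high_volume_ips`, the `min_requests` threshold, and the sample addresses are all hypothetical, not part of any Khoros or analytics API.

```python
from collections import Counter

def high_volume_ips(ip_addresses, min_requests=100):
    """Return (ip, request_count) pairs at or above min_requests, busiest first.

    ip_addresses: one entry per search request, e.g. a column exported
    from a web analytics tool (assumed format, not a real export schema).
    """
    counts = Counter(ip_addresses)
    return [(ip, n) for ip, n in counts.most_common() if n >= min_requests]

# Made-up sample data using documentation-reserved IP ranges.
sample = ["203.0.113.5"] * 150 + ["198.51.100.7"] * 3 + ["192.0.2.1"] * 120
suspects = high_volume_ips(sample, min_requests=100)
```

The right threshold depends on your normal traffic, so it is probably worth comparing against a quiet week before locking anyone out.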


Khoros Best Practice until August 2019. Onwards posting as Claudius.
Learn how to master Khoros. Learn Best Practice in the Community Documentation
If you appreciate my efforts, please give me a kudo ↓
Accept as solution to help others find it faster.

Hi @ClaudiusH - 

 

We've also seen this problem - we got hit with some sort of automated search bot around Oct 20, and it's making our reports look sort of crazy. See screenshot for more context.

 

You mention manually cleaning up the report - how would I do this?

 

Cheers!
- Caroline

 

image.png

 

Thanks!

Glad (in a weird way) that we're not alone on the weird searches. It appears to be a search crawler called Arachni, but that's about as far as I got with figuring out who/what was behind the inflated search results.

 

As for scrubbing search results, we exported our searches with and without results to Excel, and removed the ones that said _change_me_, 1, Arachni, or some variation of it. Then we summed up the total searches again, re-calculated successful and unsuccessful, and that was that (with a notation on our monthly dashboard).
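
For anyone who would rather script this than do it by hand in Excel, the same steps could be sketched in Python roughly as follows. This is only an assumption-laden sketch: the `(term, count, had_results)` row layout, the `JUNK_PATTERNS` list, and the function names are hypothetical and would need adapting to whatever your export actually looks like.

```python
import re

# Junk patterns based on the terms named in this thread; extend as needed.
JUNK_PATTERNS = [
    re.compile(r"_change_me_", re.IGNORECASE),
    re.compile(r"^1$"),
    re.compile(r"arachni", re.IGNORECASE),
]

def is_junk(term):
    """True if the search term matches any known junk pattern."""
    cleaned = term.strip()
    return any(p.search(cleaned) for p in JUNK_PATTERNS)

def scrub(rows):
    """Drop junk terms, then re-total successful and unsuccessful searches.

    rows: iterable of (term, count, had_results) tuples (assumed layout).
    """
    kept = [r for r in rows if not is_junk(r[0])]
    total = sum(count for _, count, _ in kept)
    successful = sum(count for _, count, ok in kept if ok)
    return {
        "total": total,
        "successful": successful,
        "unsuccessful": total - successful,
        "success_rate": successful / total if total else 0.0,
    }

# Made-up numbers for illustration only.
rows = [
    ("_change_me_", 5000, False),
    ("1", 3000, False),
    ("Arachni test", 200, False),
    ("reset password", 600, True),
    ("billing", 400, False),
]
summary = scrub(rows)
```

The notation on the dashboard is still worth keeping, since the scrubbed totals will no longer match what the platform reports.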

 

Either way, I hope it goes away. Scrubbing results takes time!

Jillian Bejtlich
Community Architect
The Community Roundtable

Thanks for the info, Jillian! I was hoping to have some way to scrub the data that appears in LSI, but I imagine that's not really possible. I'll use a similar offline process to what you described.

Glad it's not just us, also!

As far as preventing this in the future - is there any way to limit the number of searches a particular IP address can do within a certain amount of time? It looks like just about all of these searches came from an anonymous user in Ashburn, VA, and were concentrated within just a few minutes.

Khoros Alumni (Retired)

The manual cleaning I was referring to is exactly the process you described to verify and remove irrelevant search terms and then re-calculate percentages in a spreadsheet.

 

Currently there is no way to limit search usage by certain criteria. Maybe it's worth creating a "search flood prevention" functionality similar to the message flood prevention. If you are still seeing this search spam on your community and it's placing a burden on your search reporting, I suggest submitting a product enhancement idea and referencing this discussion.
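
To make the enhancement idea more concrete, a "search flood prevention" rule could conceptually work like the sliding-window sketch below. This is purely hypothetical and does not describe any existing platform feature; the class name `SearchFloodLimiter` and the default limits are assumptions chosen for illustration.

```python
from collections import defaultdict, deque

class SearchFloodLimiter:
    """Allow at most max_searches per IP within a sliding time window."""

    def __init__(self, max_searches=20, window_seconds=60.0):
        self.max_searches = max_searches
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # ip -> timestamps of allowed searches

    def allow(self, ip, now):
        """Return True if a search from ip at time now (seconds) is allowed."""
        recent = self._recent[ip]
        # Forget searches that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_searches:
            return False
        recent.append(now)
        return True

# Example: 3 searches per minute per IP.
limiter = SearchFloodLimiter(max_searches=3, window_seconds=60)
decisions = [limiter.allow("203.0.113.5", t) for t in (0, 1, 2, 3)]
```

A burst like the ones described in this thread, thousands of searches from one address within a few minutes, would be cut off after the first few requests under a rule of this kind.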



Seems more spam is coming from Ashburn. https://www.google.com/search?ei=w2YNWuDHCsvKwQLMppvAAQ&q=Ashburn+spam&oq=Ashburn+spam&gs_l=psy-ab.3...

By the way, we are also seeing the number "1" as a search term in our search results, but only recently. We've had another issue open whereby a specific search term increased to an average of 6000. The numbers for that specific search term are very high and very unusual compared to our historical search reports in LSI.
To compare, the No. 2 search term ranges from 600 to 900 results, while this specific search term has been the No. 1 search term since May, with 6000 hits on average.
We would also like to know what's behind this number, since it is unlikely to represent "customer" traffic.
Good luck!
Wendy

Learning from others and helping where I can!
Community Passionista!

Thanks, @ClaudiusH!

Hi @Wendy_S - 

 

We also saw another - much smaller - burst of this spammy search Nov 13 and 14. Seems like it'll be a recurring thing until there's a way to limit search usage or some other blocking techniques for anonymous users. Yuck.

 

Regards,

- Caroline
