Amazon Bot unable to crawl Native Shopping Ads

Home Forums BulletProof Security Pro Amazon Bot unable to crawl Native Shopping Ads

Viewing 9 posts - 1 through 9 (of 9 total)
  • Author
    Posts
  • #34396
    carsafety
    Participant

    Hello, I just received the email below from Amazon.  I do not have these lines in either of my website’s robots.txt file.  Is it possible these are incorporated by BulletProof security?  If so, how do I resolve this please?

    For Recommendation Ads to work effectively, our crawlers need to index your pages. We noticed your website does not allow Amazon to crawl itself, which is leading to problems in loading Native Shopping Ads.

    Here’s what you need to do to fix this problem:

    The Amazon crawlers can be identified by the following user agent string ‘Mozilla/5.0 (compatible;contxbot/1.0)’. To update your robots.txt file to grant our crawler access to your pages, remove the following two lines of text from your robots.txt file:

    User-agent: Mozilla/5.0 (compatible;contxbot/1.0)

    Disallow: /

    Alternately, just put this in these two lines of text:

    User-agent: Mozilla/5.0 (compatible;contxbot/1.0)

    Disallow:

    #34400
    AITpro Admin
    Keymaster

    I don’t think the Amazon error message is accurate.  Check your BPS Security Log for any log entries that have the Amazon Bot User Agent:  User-agent: Mozilla/5.0 (compatible;contxbot/1.0) and post 1 of them in your forum reply.

    #34401
    carsafety
    Participant

    Nothing like that in the security log, however I would estimate that around half of the log entries are blocking host amazonaws.com like this entry below.  Perhaps related?

    [403 GET Request: October 21, 2017 - 3:32 pm]
    BPS Pro: 13.3.3
    WP: 4.8.2
    Event Code: BFHS - Blocked/Forbidden Hacker or Spammer
    Solution: N/A - Hacker/Spammer Blocked/Forbidden
    REMOTE_ADDR: 54.157.163.187
    Host Name: ec2-54-157-163-187.compute-1.amazonaws.com
    SERVER_PROTOCOL: HTTP/1.1
    HTTP_CLIENT_IP:
    HTTP_FORWARDED:
    HTTP_X_FORWARDED_FOR:
    HTTP_X_CLUSTER_CLIENT_IP:
    REQUEST_METHOD: GET
    HTTP_REFERER:
    REQUEST_URI: /3966/the-5-step-test/comment-page-2/
    QUERY_STRING:
    HTTP_USER_AGENT: curl/7.26.0
    #34402
    AITpro Admin
    Keymaster

    The User Agent string is being blocked since “curl” is in the UA String. Do the steps below.

    1. Copy the modified BPS Query String Exploits code below (curl has been removed from the code below) to this BPS Root Custom Code text box: CUSTOM CODE BPSQSE BPS QUERY STRING EXPLOITS
    2. Click the Save Root Custom Code button.
    3. Go to the BPS Security Modes page and click the Root Folder BulletProof Mode Activate button.

    # BEGIN BPSQSE BPS QUERY STRING EXPLOITS
    # The libwww-perl User Agent is forbidden - Many bad bots use libwww-perl modules, but some good bots use it too.
    # Good sites such as W3C use it for their W3C-LinkChecker.
    # Use BPS Custom Code to add or remove user agents temporarily or permanently from the
    # User Agent filters directly below or to modify/edit/change any of the other security code rules below.
    RewriteCond %{HTTP_USER_AGENT} (havij|libwww-perl|wget|python|nikto|scan|java|winhttp|clshttp|loader) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (;|<|>|'|"|\)|\(|%0A|%0D|%22|%27|%28|%3C|%3E|%00).*(libwww-perl|wget|python|nikto|scan|java|winhttp|HTTrack|clshttp|archiver|loader|email|harvest|extract|grab|miner) [NC,OR]
    RewriteCond %{THE_REQUEST} (\?|\*|%2a)+(%20+|\\s+|%20+\\s+|\\s+%20+|\\s+%20+\\s+)(http|https)(:/|/) [NC,OR]
    RewriteCond %{THE_REQUEST} etc/passwd [NC,OR]
    RewriteCond %{THE_REQUEST} cgi-bin [NC,OR]
    RewriteCond %{THE_REQUEST} (%0A|%0D|\\r|\\n) [NC,OR]
    RewriteCond %{REQUEST_URI} owssvr\.dll [NC,OR]
    RewriteCond %{HTTP_REFERER} (%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{HTTP_REFERER} \.opendirviewer\. [NC,OR]
    RewriteCond %{HTTP_REFERER} users\.skynet\.be.* [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(http|https):// [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC,OR]
    RewriteCond %{QUERY_STRING} \=PHP[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} [NC,OR]
    RewriteCond %{QUERY_STRING} (\.\./|%2e%2e%2f|%2e%2e/|\.\.%2f|%2e\.%2f|%2e\./|\.%2e%2f|\.%2e/) [NC,OR]
    RewriteCond %{QUERY_STRING} ftp\: [NC,OR]
    RewriteCond %{QUERY_STRING} (http|https)\: [NC,OR]
    RewriteCond %{QUERY_STRING} \=\|w\| [NC,OR]
    RewriteCond %{QUERY_STRING} ^(.*)/self/(.*)$ [NC,OR]
    RewriteCond %{QUERY_STRING} ^(.*)cPath=(http|https)://(.*)$ [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*embed.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^e]*e)+mbed.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*object.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^o]*o)+bject.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*iframe.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^i]*i)+frame.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [NC,OR]
    RewriteCond %{QUERY_STRING} base64_(en|de)code[^(]*\([^)]*\) [NC,OR]
    RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
    RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2}) [OR]
    RewriteCond %{QUERY_STRING} ^.*(\(|\)|<|>|%3c|%3e).* [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(\x00|\x04|\x08|\x0d|\x1b|\x20|\x3c|\x3e|\x7f).* [NC,OR]
    RewriteCond %{QUERY_STRING} (NULL|OUTFILE|LOAD_FILE) [OR]
    RewriteCond %{QUERY_STRING} (\.{1,}/)+(motd|etc|bin) [NC,OR]
    RewriteCond %{QUERY_STRING} (localhost|loopback|127\.0\.0\.1) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|>|'|%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{QUERY_STRING} concat[^\(]*\( [NC,OR]
    RewriteCond %{QUERY_STRING} union([^s]*s)+elect [NC,OR]
    RewriteCond %{QUERY_STRING} union([^a]*a)+ll([^s]*s)+elect [NC,OR]
    RewriteCond %{QUERY_STRING} \-[sdcr].*(allow_url_include|allow_url_fopen|safe_mode|disable_functions|auto_prepend_file) [NC,OR]
    RewriteCond %{QUERY_STRING} (;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|drop|delete|update|cast|create|char|convert|alter|declare|order|script|set|md5|benchmark|encode) [NC,OR]
    RewriteCond %{QUERY_STRING} (sp_executesql) [NC]
    RewriteRule ^(.*)$ - [F]
    # END BPSQSE BPS QUERY STRING EXPLOITS
    #34405
    carsafety
    Participant

    thanks, that seems to have solved that set of errors, but immediately after making that change I received a warning from jetpack that my website was down (it wasn’t) and now I see these filling up the log:

    [405 HEAD Request: October 22, 2017 - 2:30 pm]
    BPS Pro: 13.3.3
    WP: 4.8.2
    Event Code: BFHS-HEAD - HEAD Request Blocked
    Solution: https://forum.ait-pro.com/forums/topic/security-log-event-codes/
    REMOTE_ADDR: 192.0.101.226
    Host Name: wordpress.com
    SERVER_PROTOCOL: HTTP/1.1
    HTTP_CLIENT_IP:
    HTTP_FORWARDED:
    HTTP_X_FORWARDED_FOR:
    HTTP_X_CLUSTER_CLIENT_IP:
    REQUEST_METHOD: HEAD
    HTTP_REFERER:
    REQUEST_URI: /
    QUERY_STRING:
    HTTP_USER_AGENT: jetmon/1.0 (Jetpack Site Uptime Monitor by WordPress.com)
    
    [405 HEAD Request: October 22, 2017 - 2:30 pm]
    BPS Pro: 13.3.3
    WP: 4.8.2
    Event Code: BFHS-HEAD - HEAD Request Blocked
    Solution: https://forum.ait-pro.com/forums/topic/security-log-event-codes/
    REMOTE_ADDR: 54.217.201.243
    Host Name: ec2-54-217-201-243.eu-west-1.compute.amazonaws.com
    SERVER_PROTOCOL: HTTP/1.1
    HTTP_CLIENT_IP:
    HTTP_FORWARDED:
    HTTP_X_FORWARDED_FOR:
    HTTP_X_CLUSTER_CLIENT_IP:
    REQUEST_METHOD: HEAD
    HTTP_REFERER:
    REQUEST_URI: /
    QUERY_STRING:
    HTTP_USER_AGENT: jetmon/1.0 (Jetpack Site Uptime Monitor by WordPress.com)
    #34406
    AITpro Admin
    Keymaster

    Run the Pre-Installation Wizard and Setup Wizard again. Then recheck your BPS Query String Exploits Custom Code and make sure “curl” is still deleted. If it is not deleted then manually delete it from these 2 security rules as shown below.

    RewriteCond %{HTTP_USER_AGENT} (havij|libwww-perl|wget|python|nikto|scan|java|winhttp|clshttp|loader) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (;|<|>|'|"|\)|\(|%0A|%0D|%22|%27|%28|%3C|%3E|%00).*(libwww-perl|wget|python|nikto|scan|java|winhttp|HTTrack|clshttp|archiver|loader|email|harvest|extract|grab|miner) [NC,OR]
    #34407
    carsafety
    Participant

    Ran both and they completed successfully.  Jetpack sent an update that my site was back up again a couple minutes later, so I assume the issue is resolved.   curl did not appear in the custom code after the wizards.  Thanks!

    #34408
    AITpro Admin
    Keymaster

    Great!  Thanks for confirming that “curl” remained deleted from your Query String Exploits Custom Code. And thank you for the PayPal donation.  That was very generous of you.  🙂

    #34411
    carsafety
    Participant

    Only a couple security log entries since running the wizards, and both looked like the suspicious ones that should have been blocked.

    Thank you again for the fast support, even when the problem turns out to only be indirectly related to BPS pro.

Viewing 9 posts - 1 through 9 (of 9 total)
  • You must be logged in to reply to this topic.