Alexa Certified Site Metrics – AlexaBot 403 Error, Alexa Certify


Viewing 15 posts - 16 through 30 (of 37 total)
  • #2452
    Dave Smith
    Member

    Hi,
    Okay, on a hunch I commented out filters 1 & 3 and it worked again.
    So it’s somewhere in there. That’s the limit of my knowledge, I’m afraid, so over to you!

    #2453
    AITpro Admin
    Keymaster

    That means some condition, or several conditions, in those User Agent security filters is matching the Alexa request. I have decided to purchase an Alexa Site Metrics account so that I can test and isolate the exact conditions that are being blocked, and I will post back here once I find out what they are. In the meantime, just leave all 3 User Agent filters commented out.

    My hunch is that libwww-perl, wget or java appears in the Alexa User Agent, along with one or more of these urlencoded characters, most likely %3C and %3E:

    LF line feed %0A
    CR carriage return %0D
    Single quote %27
    < left angle bracket %3C
    > right angle bracket %3E
    NUL null character %00
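
For anyone who wants to double check the urlencoded values in the list above, here is a minimal Python sketch. It is only an illustration and not part of BPS; the names `SUSPECT_CHARS`, `percent_encode` and `ENCODED` are made up for this example.

```python
from urllib.parse import quote

# Characters from the list above, keyed by a short label.
# The labels and variable names are hypothetical, chosen for this example only.
SUSPECT_CHARS = {
    "LF": "\n",
    "CR": "\r",
    "single quote": "'",
    "left angle bracket": "<",
    "right angle bracket": ">",
    "NUL": "\x00",
}


def percent_encode(ch: str) -> str:
    """Return the urlencoded (percent-encoded) form of a single character."""
    # safe="" makes quote() encode everything except letters, digits and _.-~
    return quote(ch, safe="")


ENCODED = {label: percent_encode(ch) for label, ch in SUSPECT_CHARS.items()}

if __name__ == "__main__":
    for label, code in ENCODED.items():
        print(f"{label}: {code}")
```

A filter that lists %3C or %3E only matches the literal encoded text in the header, so this is just a way to see which raw character each code stands for.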
    #2555
    AITpro Admin
    Keymaster

    Click this link:  http://forum.ait-pro.com/forums/topic/alexa-certify/page/2/#post-9696 for the new method of whitelisting the Alexa Bot using BPS Custom Code.

    The word “scan” in the Alexa User Agent string is what is being blocked.  Remove/delete “scan|” from the 2 User Agent Filters shown below in the root .htaccess file.

    Alexa’s “Alexabot” crawler identifies itself in the User Agent as
    “Mozilla/5.0 (compatible; alexa Alexabot/1.0; +http://www.alexa.com/certifyscan; certifyscan@alexa.com)”.
    Ask your webmaster or IT person to make sure that your website returns valid pages to Alexa’s crawler.

    RewriteCond %{HTTP_USER_AGENT} (havij|libwww-perl|wget|python|nikto|curl|scan|java|winhttp|clshttp|loader) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (;|<|>|'|"|\)|\(|%0A|%0D|%22|%27|%28|%3C|%3E|%00).*(libwww-perl|wget|python|nikto|curl|scan|java|winhttp|HTTrack|clshttp|archiver|loader|email|harvest|extract|grab|miner) [NC,OR]
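
To see why “scan” is the culprit, the first User Agent filter can be tried against the Alexabot User Agent string outside of Apache. The sketch below uses Python's `re` module as a stand-in for mod_rewrite's regex engine, which is an assumption that holds for a plain alternation like this one; the function names are hypothetical.

```python
import re

# Alexabot User Agent string as quoted in this post.
ALEXABOT_UA = (
    "Mozilla/5.0 (compatible; alexa Alexabot/1.0; "
    "+http://www.alexa.com/certifyscan; certifyscan@alexa.com)"
)

# First User Agent filter as shipped, and the same filter with "scan|" removed.
FILTER_WITH_SCAN = (
    r"(havij|libwww-perl|wget|python|nikto|curl|scan|java|winhttp|clshttp|loader)"
)
FILTER_WITHOUT_SCAN = (
    r"(havij|libwww-perl|wget|python|nikto|curl|java|winhttp|clshttp|loader)"
)


def matched_word(pattern: str, user_agent: str):
    """Return the first filter word found in the User Agent, or None."""
    # re.IGNORECASE stands in for the [NC] flag; search() is unanchored,
    # like a RewriteCond pattern without ^ or $.
    match = re.search(pattern, user_agent, re.IGNORECASE)
    return match.group(1) if match else None


def is_blocked(pattern: str, user_agent: str) -> bool:
    """True if the filter pattern matches anywhere in the User Agent."""
    return matched_word(pattern, user_agent) is not None


if __name__ == "__main__":
    print("with scan:   ", matched_word(FILTER_WITH_SCAN, ALEXABOT_UA))
    print("without scan:", matched_word(FILTER_WITHOUT_SCAN, ALEXABOT_UA))
```

The match comes from “certifyscan” in the URL and email address inside the User Agent, not from anything the crawler is doing.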
    #2558
    Dave Smith
    Member

    That’s brilliant, thanks so much for taking the time to work through this and find the solution. Can I ask, what would be a sensible way to return valid pages to the Alexa bot?

    Thanks again,
    Dave

    #2559
    AITpro Admin
    Keymaster

    I don’t understand your question.  Please explain.  Thanks.

    #2563
    Dave Smith
    Member

    Post 2555: “Ask your webmaster or IT person to make sure that your website returns valid pages to Alexa’s crawler”

    Apart from changing the filters, how would you do this?

    #2564
    AITpro Admin
    Keymaster

    The solution was to remove “scan” from the 2 User Agent filters.  This solves the problem so there is no need to find another way of doing this.  😉  If you are asking if it is safe to remove “scan” from the 2 security filters then the answer is Yes.  There is zero risk in removing scan from those security filters because there are many overlapping layers of security in BPS and the User Agent filters are an insignificant first layer of security filters.

    #2567
    AITpro Admin
    Keymaster

    If you are saying that you are having another problem, then it would not be related to BPS. I purchased an Alexa Certified Site Metrics account and set it up successfully. Go to your Alexa Dashboard, click on Site Management, then click on Certification Status, and you should see the result shown below. If not, you will probably see an error saying that some of your pages are missing the Certify Code. Did you add the Alexa JavaScript code directly under the body tag in your Theme’s header.php file?

    SCAN RESULT

    Great Job! Your site metrics are Certified.
    We scanned 158 pages and found no pages missing the Certify Code.

    SITE COVERAGE

    100%

    #3080
    Dave Smith
    Member

    Hi,

    yes it’s working fine now, once again many thanks for taking the time to work through it.

    Thanks,
    Dave

    #9694
    Nick Moustakas
    Participant

    Hello there,

    I face the same problem with Alexa! I can’t get certified metrics for my site. The answer from Alexa is the following:

    “We are showing 0 pages scanned for your site because it is returning a “403 Forbidden” error to our crawler. Please ask your webmaster or IT person to correct the problem with your server so that it does not return an error to the “Alexabot” crawler.

    Alexa’s “Alexabot” crawler identifies itself in the User Agent as “Mozilla/5.0 (compatible; Alexabot/1.0; +http://www.alexa.com/help/certifyscan; certifyscan@alexa.com)”

    So I read this post, but I’m confused about the correct order of the steps to be taken.

    The only thing that I’ve done is to delete “scan|” from the 2 User Agent Filters shown below in the root .htaccess file.

    RewriteCond %{HTTP_USER_AGENT} (havij|libwww-perl|wget|python|nikto|curl|scan|java|winhttp|clshttp|loader) [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} (;|<|>|'|"|\)|\(|%0A|%0D|%22|%27|%28|%3C|%3E|%00).*(libwww-perl|wget|python|nikto|curl|scan|java|winhttp|HTTrack|clshttp|archiver|loader|email|harvest|extract|grab|miner) [NC,OR]

    as you mentioned above in other posts. Please let me know what else I need to do, because the problem still exists.

    WordPress blog: trypies-tsepes.gr

    Thanks for your time,
    Nick

    #9696
    AITpro Admin
    Keymaster

    The newer method using BPS Custom Code is as follows:

    1. Copy the code below (scan has already been removed from it) into this BPS Custom Code text box: CUSTOM CODE BPSQSE BPS QUERY STRING EXPLOITS: Modify Query String Exploit code here
    2. Click the Save Root Custom Code button.
    3. Go to the BPS Security Modes page and click the Root Folder BulletProof Mode Activate button.
    Note: For good measure, clear your browser cache and, if you are using a caching plugin, clear its cache as well.

    # BEGIN BPSQSE BPS QUERY STRING EXPLOITS
    # The libwww-perl User Agent is forbidden - Many bad bots use libwww-perl modules, but some good bots use it too.
    # Good sites such as W3C use it for their W3C-LinkChecker. 
    # Use BPS Custom Code to add or remove user agents temporarily or permanently from the 
    # User Agent filters directly below or to modify/edit/change any of the other security code rules below.
    RewriteCond %{HTTP_USER_AGENT} (havij|libwww-perl|wget|python|nikto|curl|java|winhttp|clshttp|loader) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (;|<|>|'|"|\)|\(|%0A|%0D|%22|%27|%28|%3C|%3E|%00).*(libwww-perl|wget|python|nikto|curl|java|winhttp|HTTrack|clshttp|archiver|loader|email|harvest|extract|grab|miner) [NC,OR]
    RewriteCond %{THE_REQUEST} (\?|\*|%2a)+(%20+|\\s+|%20+\\s+|\\s+%20+|\\s+%20+\\s+)(http|https)(:/|/) [NC,OR]
    RewriteCond %{THE_REQUEST} etc/passwd [NC,OR]
    RewriteCond %{THE_REQUEST} cgi-bin [NC,OR]
    RewriteCond %{THE_REQUEST} (%0A|%0D|\\r|\\n) [NC,OR]
    RewriteCond %{REQUEST_URI} owssvr\.dll [NC,OR]
    RewriteCond %{HTTP_REFERER} (%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{HTTP_REFERER} \.opendirviewer\. [NC,OR]
    RewriteCond %{HTTP_REFERER} users\.skynet\.be.* [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(http|https):// [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [NC,OR]
    RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC,OR]
    RewriteCond %{QUERY_STRING} \=PHP[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} [NC,OR]
    RewriteCond %{QUERY_STRING} (\.\./|%2e%2e%2f|%2e%2e/|\.\.%2f|%2e\.%2f|%2e\./|\.%2e%2f|\.%2e/) [NC,OR]
    RewriteCond %{QUERY_STRING} ftp\: [NC,OR]
    RewriteCond %{QUERY_STRING} (http|https)\: [NC,OR]
    RewriteCond %{QUERY_STRING} \=\|w\| [NC,OR]
    RewriteCond %{QUERY_STRING} ^(.*)/self/(.*)$ [NC,OR]
    RewriteCond %{QUERY_STRING} ^(.*)cPath=(http|https)://(.*)$ [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*embed.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^e]*e)+mbed.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*object.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^o]*o)+bject.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*iframe.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|%3C)([^i]*i)+frame.*(>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [NC,OR]
    RewriteCond %{QUERY_STRING} base64_(en|de)code[^(]*\([^)]*\) [NC,OR]
    RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
    RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2}) [OR]
    RewriteCond %{QUERY_STRING} ^.*(\(|\)|<|>|%3c|%3e).* [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(\x00|\x04|\x08|\x0d|\x1b|\x20|\x3c|\x3e|\x7f).* [NC,OR]
    RewriteCond %{QUERY_STRING} (NULL|OUTFILE|LOAD_FILE) [OR]
    RewriteCond %{QUERY_STRING} (\.{1,}/)+(motd|etc|bin) [NC,OR]
    RewriteCond %{QUERY_STRING} (localhost|loopback|127\.0\.0\.1) [NC,OR]
    RewriteCond %{QUERY_STRING} (<|>|'|%0A|%0D|%27|%3C|%3E|%00) [NC,OR]
    RewriteCond %{QUERY_STRING} concat[^\(]*\( [NC,OR]
    RewriteCond %{QUERY_STRING} union([^s]*s)+elect [NC,OR]
    RewriteCond %{QUERY_STRING} union([^a]*a)+ll([^s]*s)+elect [NC,OR]
    RewriteCond %{QUERY_STRING} \-[sdcr].*(allow_url_include|allow_url_fopen|safe_mode|disable_functions|auto_prepend_file) [NC,OR]
    RewriteCond %{QUERY_STRING} (;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|drop|delete|update|cast|create|char|convert|alter|declare|order|script|set|md5|benchmark|encode) [NC,OR]
    RewriteCond %{QUERY_STRING} (sp_executesql) [NC]
    RewriteRule ^(.*)$ - [F]
    # END BPSQSE BPS QUERY STRING EXPLOITS
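
After activating the new code, one rough way to check the result without waiting for the next Alexa scan is to request a page while sending the Alexabot User Agent string. The Python sketch below is only an approximation under stated assumptions: it copies the User Agent quoted by Alexa earlier in this topic, the function names are hypothetical, and the real crawler may differ in other headers or in its IP address.

```python
import sys
import urllib.error
import urllib.request

# Alexabot User Agent string as quoted in Alexa's reply earlier in this topic.
ALEXABOT_UA = (
    "Mozilla/5.0 (compatible; Alexabot/1.0; "
    "+http://www.alexa.com/help/certifyscan; certifyscan@alexa.com)"
)


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the Alexabot User Agent."""
    # Overriding the header matters: urllib's default User Agent contains
    # "Python", which the first User Agent filter ("python") would match.
    return urllib.request.Request(url, headers={"User-Agent": ALEXABOT_UA})


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User Agent."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is in err.code.
        return err.code


if __name__ == "__main__":
    # Usage: python check_alexabot.py http://your-site.example/
    if len(sys.argv) > 1:
        print(fetch_status(sys.argv[1]))
```

A 403 here suggests one of the filters is still matching; a 200 suggests the User Agent is no longer being blocked by the server.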
    #9712
    Nick Moustakas
    Participant

    Hi,

    Thanks for the instructions! Now everything works just fine!

    Best Regards,
    Nick

    #9722
    AITpro Admin
    Keymaster

    Great!  Thanks for confirming that everything is working.

    #10485
    Akhil K A
    Participant

    Hi.

    I can’t verify my Alexa certification. It shows a 403 error. If I disable BulletProof Security, everything works fine. My site also still hasn’t been listed in Alexa!

    I tried removing HEAD from the root .htaccess file, but it didn’t help, and the solution in http://forum.ait-pro.com/forums/topic/alexa-certify/ is not working for me either!

    Details:
    Bulletproof plugin version: .49.3
    Issue Website: androidizer.com

    #10488
    AITpro Admin
    Keymaster

    Your topic has been merged.  This is the only known issue with Alexa so double check that you have made the modifications correctly.
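
One way to double check the modification is to look for any active User Agent filter that still lists “scan”. The Python sketch below does that on the text of an .htaccess file. It is a hypothetical helper, not a BPS feature, and it assumes the filters look like the RewriteCond lines posted earlier in this topic.

```python
import re
import sys


def ua_filters_with_scan(htaccess_text: str) -> list:
    """Return active HTTP_USER_AGENT RewriteCond lines that still list 'scan'."""
    hits = []
    for line in htaccess_text.splitlines():
        stripped = line.strip()
        # Commented-out filters are not active, so skip them.
        if stripped.startswith("#"):
            continue
        if not stripped.startswith("RewriteCond %{HTTP_USER_AGENT}"):
            continue
        # Look for "scan" as a whole alternative, bounded by "(", "|" or ")".
        if re.search(r"[(|]scan[|)]", stripped, re.IGNORECASE):
            hits.append(stripped)
    return hits


if __name__ == "__main__":
    # Usage: python check_htaccess.py /path/to/your/.htaccess
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            for hit in ua_filters_with_scan(handle.read()):
                print("still contains scan:", hit)
```

If this prints nothing for the root .htaccess file and the 403 persists, the block is probably coming from somewhere other than these two filters.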
