How Disallowing CSS and JavaScript Crawling Affects Rankings in Panda

There’s been some general confusion over whether or not disallowing crawling of CSS and JavaScript might negatively impact rankings via the Panda algorithm. The answer from Google has ranged from a definite "no" to a suggestion that it certainly does, and the confusion persists because Google maintains there isn’t a straightforward answer to the question.

That being said, there are still reasons to consider whether you wish to allow or disallow crawls of your CSS or JavaScript. On the negative side, your pages may be pulling in a lot of content that isn’t yours, and there are a host of reasons this could be a concern. Foremost, you may wish to differentiate your own content from content hosted elsewhere. Automated crawling cannot reasonably tell local content apart from foreign content. So, disallowing crawling of certain paths may be a way to logically divide, for a crawler’s benefit, what is local and what is not.
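As a minimal sketch of what that looks like in practice, a robots.txt file along these lines would keep compliant crawlers out of asset directories. The directory names here are hypothetical and would need to match your own site’s structure:

User-agent: *
# Hypothetical asset paths: block crawling of stylesheets and scripts
Disallow: /css/
Disallow: /js/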

On a more complicated level, you may be protecting yourself from questions of legal and illegal content usage. At the end of the day, a crawler can only discern what is integral to your content, that is, how things appear visually on the page. It cannot detect third-party content that is legitimately used for limited purposes but cannot be used for broader ones. This is where you may need to step in for more fine-tuned, detailed culling of what can be gathered via crawling.

The response from Google regarding allowing or disallowing crawling is a bit mixed as well. They assure people that disallowing crawling of CSS and JavaScript will not adversely affect ranking. However, they also point out that it can damage how Google interprets a site. Responsive sites, that is, sites that actively change layout based on the browser or device (formatting for smartphones, for instance), will not be interpreted visually and properly by Google if CSS crawling is disallowed.

Regarding Panda itself, their stance is that they try to gauge the overall quality of a page or website. That is to say, the most important aspect is the general content and quality of what is being crawled. This includes the content itself, as well as whether the HTML is correct and all links are properly formatted. Aesthetics governed by CSS and JavaScript are more of a technical/layout concern, and Panda doesn’t much care how this is done. Thus, the Panda algorithm largely ignores these elements anyway.

It should be noted, though, that Google representatives say this isn’t a “primary” factor in the quality algorithm; they do not say it has no effect at all. In fact, Google’s official documentation recommends unblocking CSS and JavaScript for crawling. Part of the issue may be with its new Fetch and Render technology.
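As a hedged example of following that recommendation, the wildcard rules below are a common community pattern rather than official Google boilerplate; they explicitly allow Googlebot to fetch stylesheets and scripts wherever they live on the site:

User-agent: Googlebot
# Allow stylesheets and scripts anywhere on the site
Allow: /*.css$
Allow: /*.js$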

Google hasn’t officially responded to comments regarding this, but if they are using render technology to get a visual approximation of a website, a page viewed without its accompanying style sheets and JavaScript could very much look broken. Again, Google has not confirmed this to be the case, but they have also not been very forthcoming and direct about how much impact disallowing crawls can have on a site’s ranking.

One factor to keep in mind, though, is that Panda is always a work in progress. Google continually fine-tunes its algorithms to make the service more accurate. What that means, from your perspective, is that while it may be all right now to disallow crawls of CSS and JavaScript, that does not mean it will be all right in the future.

As evidence of this, many people have complained that their rankings dropped specifically with the Panda 4.0 release. Indeed, in one such discussion thread, a Google representative admitted that disallowing access to any URL that significantly affects visual display should be avoided. So, that’s the rather confusing short of it: Google maintains that disallowing crawls will not significantly impact the Panda algorithm’s ranking, while at the same time saying that disallowing them may make the site difficult to “read”. It could hardly be less clear than that. But Google always tends to be closed about the specifics of its algorithms, to help protect them from people trying to work the system.

Bearing that in mind, the safest of all options may be to simply allow the crawls rather than risk the negative impact. If you find that unacceptable, a middle ground might be to allow access to the files that are central to visual formatting, so Google can reassemble the page properly when it retrieves it. But if you are going the safest possible route, it is probably best to simply allow them all if you can. And, of course, if it is a planned crawl you are prepping your site for, you can always allow crawling just for the time it takes to complete, then turn the disallow rules back on afterward.
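A rough sketch of that middle ground might look like the following, where the blocked directory and file names are hypothetical stand-ins for whichever assets actually drive your layout. Google treats the most specific matching rule as the winner, so the Allow lines override the broader Disallow:

User-agent: *
# Hypothetical paths: block the asset directory as a whole...
Disallow: /assets/
# ...but let crawlers fetch the files that control visual layout
Allow: /assets/main.css
Allow: /assets/layout.js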