How to Keep Google’s Panda from Ruining Your Rankings

It used to be that Google let many crawling problems slide. Not anymore! Its Panda updates, now almost three years old, penalize websites for communicating poorly with Googlebot. Panda 4.0 rolled out just last month and has gotten quite a bit of press. Here are some tips to prevent a penalty on your clients' sites. Panda is always evolving, but typically penalizes:

"Thin" content

If you heard "thin is in," think again: Google DISLIKES pages with little content. Before Panda, the recommendation was that articles should be around 250 words in length. After Panda, that rose to a minimum of 450 words. As time has passed, some studies have shown Google favoring pages 1,000 words in length! Of course, you shouldn't sacrifice readability to meet such a quota: Keep content easy to browse and skim.

How do you Panda-proof content? Build pages out to 450-1,000 words. Where that's not possible, try consolidating content. And don't forget to 301 redirect the old locations to the new URLs!

Duplicate content

Google doesn't like to find two pages that say the exact same thing. Google doesn't like to find two pages that say the exact same... well, you get the point. It's easy for sites to accidentally expose duplicate content to search engines: Tag pages, categories, and search results within a website can all lead to duplicate content. Even homepages can sometimes be found at multiple URLs, such as:

https://www.hyperdogmedia.com/
https://www.hyperdogmedia.com/index.html

This can be very confusing to Googlebot. Which version should be shown? Do the inbound links point to one, but onsite links to another? Never fear, there are easy fixes:

a. Block Googlebot from finding the content.
– Check and fix your internal links. Try to prevent Google from discovering duplicate content during crawling.
– Use robots metatags with a "NOINDEX" attribute and/or use robots.txt.

b. Use 301 redirects to send one location to another. A 301 is a special redirect that passes link authority from one URL to another. The many other kinds of redirects simply send a visitor to a new location and are usually not the right solution for duplicate content issues.

c. Canonical tags can also help. These tags help Google sort out the final, canonical URL for the content it finds. Where content appears on multiple websites, canonical tags are still the solution: They work cross-site!

Sitemap.xml files in disarray

Google allows webmasters to verify their identity and submit this special XML file full of useful information. Webmasters can list the pages they want Google to index, as well as:

– Define their pages' modification dates
– Set priorities for pages
– Tell Google how often each page is usually updated

Here we are able to define what Googlebot has been trying to figure out on its own for eons. But with great power comes great responsibility. For webmasters who submit (or leave in place) an outdated sitemap.xml file full of errors, missing pages, and duplicate or thin content, the situation can become dire.

The fix? Put your best foot forward and submit a good sitemap.xml file to Googlebot!

a. Visit the most likely location for your sitemap.xml file: http://www.domain.com/sitemap.xml
b. Are the URLs good quality content, or is your sitemap.xml file filled with thin, duplicate, and missing pages?
c. Also check Google Webmaster Tools: Is Google reporting errors with your sitemap.xml file there?
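To make that checklist concrete, here is a minimal sketch, in Python using only the standard library, of what a clean sitemap.xml looks like when generated programmatically. The example.com URLs, dates, change frequencies, and priorities are made-up placeholders, not values from any real site.

```python
# A minimal sketch of generating a clean sitemap.xml with Python's standard library.
# All URLs and values below are hypothetical placeholders -- swap in your site's
# real, canonical, indexable pages only.
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# (canonical URL, last modification date, change frequency, priority)
pages = [
    ("https://www.example.com/", "2014-06-01", "weekly", "1.0"),
    ("https://www.example.com/services/", "2014-05-20", "monthly", "0.8"),
    ("https://www.example.com/blog/panda-proofing/", "2014-06-10", "monthly", "0.6"),
]

urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
for loc, lastmod, changefreq, priority in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc                # the one canonical URL
    ET.SubElement(url, "lastmod").text = lastmod        # the page's modification date
    ET.SubElement(url, "changefreq").text = changefreq  # how often it usually changes
    ET.SubElement(url, "priority").text = priority      # relative priority within the site

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```

The optional elements map directly to the bullet points above; the part that keeps Panda happy is that every loc entry is a real, indexable, non-duplicate page.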
Large amounts of 404 errors, crawl errors

The sitemap.xml file is just a starting point for Google's crawling. You should certainly have your most valuable URLs in there, but know that other URLs will be crawled as well. Watch carefully in Webmaster Tools for crawl errors, and use other crawling tools such as MOZ.com to diagnose your website (a rough spot-check script follows below).

Preparing your site for future Panda updates requires thinking like Googlebot. And once a website is in tip-top shape, ongoing vigilance is usually needed. In this age of dynamic websites and ever-changing algorithms, you can't afford to rest!
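For a quick, scriptable spot check before (not instead of) digging into Webmaster Tools or MOZ.com, a rough Python sketch like the one below can flag obvious 404s and stray redirects. It assumes the third-party requests library and a urls.txt file with one URL per line (for example, the URLs from your sitemap.xml); both are assumptions, so adjust to your own setup.

```python
# A rough 404/redirect spot check, assuming the third-party `requests` library
# and a urls.txt file holding one URL per line (e.g. pulled from sitemap.xml).
# It only reports status codes; real crawl diagnostics go much deeper.
import requests

with open("urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    try:
        response = requests.get(url, timeout=10, allow_redirects=False)
    except requests.RequestException as exc:
        print(f"ERROR {url} ({exc})")
        continue
    if response.status_code >= 400:
        # 404s and other errors Googlebot would also stumble over
        print(f"{response.status_code} {url}")
    elif response.status_code in (301, 302):
        # Redirects are fine, but know where they point
        print(f"{response.status_code} {url} -> {response.headers.get('Location', '?')}")
```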

5 web development techniques to prevent Google from crawling your HTML forms

Google has recently decided to let its Googlebot crawl through forms in an effort to index the "Deep Web". There are numerous stories about wayward crawlers deleting and changing content by submitting forms, and it's about to get worse: Googlebot is about to start submitting forms in an effort to get to your website's deeper data. So what's a web developer to do?

1. Use GET and POST requests correctly
Use GET requests in forms to look up information; use POST requests to make changes. Google will only be crawling forms via GET requests, so following this best practice for forms is vital.

2. Make sure your POST forms do not respond to GET requests
It sounds so simple, but many sites are being exploited for XSS (Cross Site Scripting) vulnerabilities because they respond (and return HTML) to both GET and POST requests. Be sure to check your form input carefully on the backend, and for heaven's sake – do not use globals! (A minimal sketch of points 1 and 2 appears at the end of this post.)

3. Use robots.txt to keep robots OUT
A robots.txt file keeps Googlebot out of where it doesn't belong. Luckily, Googlebot will continue its excellent support of robots.txt directives when it goes crawling through forms. Be sure not to accidentally restrict your website too much, however. Keep the directives simple, excluding by directory if possible. And test, test, test in Google's Webmaster Tools!

4. Use robots metatag directives
Use the robots metatag directives for more refined control. We recommend "nofollow" and "noindex" directives for both the form submission page and any search results pages you want Google to stay out of, even though Google says disallowing the form submission page is enough. Consider using tag and category pages that are Google friendly instead.

5. Use a CAPTCHA where possible
Googlebot isn't going to fill out a CAPTCHA, so it's an easy way to make sure some bot isn't filling out your form. Googlebot is, of course, the nicest bot you can hope to have visit your website. This provides a chance to secure forms and take necessary precautions before other – not so polite – bots visit your forms.
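To illustrate points 1 and 2, here is a minimal sketch using the Python Flask framework (the route names and form fields are invented for the example; any web framework behaves similarly): the lookup answers GET, while the state-changing route accepts POST only, so a crawler's GET request gets a 405 error instead of triggering a deletion.

```python
# A minimal sketch of points 1 and 2, assuming the third-party Flask framework;
# the routes and form fields are hypothetical, invented for illustration.
from flask import Flask, request
from markupsafe import escape  # markupsafe is installed alongside Flask

app = Flask(__name__)

# 1. Lookups use GET: safe for a crawler to follow, nothing on the server changes.
@app.route("/search", methods=["GET"])
def search():
    query = request.args.get("q", "")
    # Escape user input before echoing it back (the XSS concern from point 2).
    return f"Results for: {escape(query)}"

# 2. State-changing actions accept POST only. Flask answers a GET to this URL
#    with 405 Method Not Allowed, so a crawling GET cannot trigger the deletion.
@app.route("/delete-account", methods=["POST"])
def delete_account():
    account_id = request.form.get("account_id", "")
    # ... perform the actual deletion here ...
    return f"Deleted account {escape(account_id)}"

if __name__ == "__main__":
    app.run()
```

Pair this with the robots.txt, metatag, and CAPTCHA measures above; correct GET/POST handling is the layer that protects you even when a less polite bot ignores all of them.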

7 untimely ways for an SEO to die

In ancient Rome, the ghosts of the ancestors were appeased during Lemuria on May 9. Not many people know that, and even fewer care. But in the spirit of Lemuria, we offer seven untimely ways an SEO can die (it's a dangerous world out there, and also I'm low on blog post ideas):

– Bitten by search engine crawlers.
– Trampled by googlebots (this is actually the best way to go, if you have to).
– Trip over an HTML tag someone forgot to close. (This was funnier last night when I thought of it – go figure.)
– You get (google)whacked while visiting a bad link neighborhood.
– You're doing the googledance, slip on a banana peel, and hit your head. Certainly I'm not the only one who knows the googledance? Please submit your videos if you know it: googledance@hyperdogmedia.com.
– You receive a suspicious package in the mail, and it turns out to be a googlebomb.
– Setting linkbait traps and you get an arm caught.

Please submit any other ideas you might have via email: lemuria@hyperdogmedia.com. So strike up that pun machine, it's Friday!

Update: Debra just suggested you could "overdose on link juice" – if only!