Archive for the 'googlebot' Category

13 Reasons Why Google Loves Blogs

Tuesday, March 10th, 2009

Google loves blogs. What is it about blogs that Google loves so very much? We’ve pinpointed 13 reasons why Google may give – or appear to give – sites with blogs a little extra boost in rankings. Of course, the list is broken down into our framework of looking at good quality sites as being accessible, relevant, and popular.

Accessibility: Search Engine robots must be able to find your content. These reasons help the bots find your postings without a lot of muss or fuss.

1. Pinging
Most blog software sends out a “ping” when there is a new post. Instead of waiting for a search engine crawler to come across your site’s new content – either via a routine crawl or via a link – a notification is sent out to services like Ping-O-Matic, Technorati, and even Google Blog Search. This notification tells the search engine robots to come and fetch some fresh (crunchy) content.
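For the curious, a ping is typically a small XML-RPC call named weblogUpdates.ping that carries the blog's name and URL. Here is a minimal Python sketch that only builds the request body and sends nothing. The blog name and URL are made-up placeholders, and your blog software may format its pings differently.

```python
import xmlrpc.client


def build_ping_request(blog_name, blog_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url),
        methodname="weblogUpdates.ping",
    )


# Placeholder values for illustration only.
payload = build_ping_request("Example Blog", "http://blog.example.com/")
print(payload)
```

Actually sending the ping means posting this body to a ping service's XML-RPC endpoint, which most blog software already does for you.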

2. RSS feeds provide deep links to content
RSS Feeds are useful for so many, many things. They carry your latest postings, but also consider that they contain links straight to the postings themselves. Even crawlers that aren’t that smart (you know who you are, little bots!) can figure out how to find a link in a list. That’s essentially all an RSS Feed is: a list of links in a predictable format. Hint: You subscribed to your feed in iGoogle, didn’t you?
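To show just how predictable that format is, here is a rough Python sketch that pulls the post links out of a tiny RSS 2.0 feed. The sample feed and its URLs are invented for illustration, and real crawlers are of course far more involved.

```python
import xml.etree.ElementTree as ET

# A made-up two-post feed; real feeds carry more fields per item.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://blog.example.com/</link>
    <item>
      <title>First post</title>
      <link>http://blog.example.com/first-post/</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://blog.example.com/second-post/</link>
    </item>
  </channel>
</rss>"""


def extract_post_links(feed_xml):
    """Return the <link> of every <item> in an RSS 2.0 feed, in feed order."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("link") for item in root.iter("item")]


for link in extract_post_links(SAMPLE_FEED):
    print(link)
```

Every item sits in the same place with the same tag names, which is why even a simple bot can follow a feed straight to your posts.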

3. Standard sitemap.xml files provide deep links to content
If an RSS feed isn’t enough, use a sitemap.xml file to notify search engines about your site, including any new posts. A great thing about sitemap.xml files is that they can communicate additional information about a link, like how often a search engine robot should visit and what priority the page has in relation to your site.
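As a rough illustration, the Python sketch below builds a small sitemap with the changefreq and priority fields from the sitemaps.org protocol. The page URLs and values are placeholders, and most blog platforms have a plugin that generates this file for you. Search engines treat both fields as hints rather than commands.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap.xml string from a list of page dictionaries."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # How often the page is expected to change.
        ET.SubElement(url, "changefreq").text = page["changefreq"]
        # Relative importance within this site, from 0.0 to 1.0.
        ET.SubElement(url, "priority").text = page["priority"]
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages for illustration only.
sitemap = build_sitemap([
    {"loc": "http://blog.example.com/", "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://blog.example.com/first-post/", "changefreq": "monthly", "priority": "0.8"},
])
print(sitemap)
```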

4. Based on modern HTML design standards
Most blogging software was created or updated very recently, and doesn’t use outdated HTML methods like nested tables, frames, or other HTML methods that can cause a bot to pause.

Relevance: Once found, search engines must be able to see the importance of your content to your desired audience.

5. Fresh content, updated often
Nothing quite gets the attention of a search engine robot like fresh content. It encourages frequent repeat visits from both humans and robots alike!

6. Fresh comments, updated often
Of course, the blogosphere is a very social place. Googlebot is likely to come back often to posts that are evolving over time, with fresh new comments being added constantly.

7. Keyword Rich Categories, Tags, URLs
Invariably, some of your best keywords are likely to be used in the tags and categories on your blog. If you aren’t using keyword rich categories and tags, you really should be.

Popular: Google looks at what other sites link to your site, how important they are, and what anchor text is used.

8. RSS Feeds provide syndication
RSS Feeds can help your content and links get spread all around the internet. Provide an easy path to syndication for the possibility of links and, of course, human traffic.

9. Extra links from blog & RSS Feed directories
The first blog I ever started was for the possibility of a link from a blog directory. But RSS Feed directories exist too! Be sure to maximize the link possibilities by submitting to both.

10. Linking between bloggers / related sites
Blogrolls are links that bloggers recommend to their audience. Sometimes they have nice, descriptive text and even use XFN to explain relationships between bloggers. Some of your best human traffic can be attained through blogrolls.

11. Social bookmarking technologies built in
Blog posts are usually created with links directly to social bookmarking services like delicious.com, stumbleupon, and other social bookmarking sites. You’ve never made it easier for your audience to share your posting and give you a link!

12. Tagging / Categories with relevant words
Tags can earn links to your blog from relevant tag pages on Technorati and other blog search engines. These tag pages sometimes even have PageRank! They deliver keyword rich links and quality traffic.

13. Trackbacks (Conversations)
Trackbacks are conversations spanning several blogs. They are an excellent way to gain links (although often nofollowed these days) and traffic. Other blogs can be part of the conversation, thanks to the trackback system!

9 ways Google is discovering the invisible web

Tuesday, July 1st, 2008

There are many parts of the web that Googlebot has not been able to access, but Google has been working to shrink that gap. Google wants to find content, and while many webmasters do not make it easy, Googlebot finds a way.

1. Crawling Flash!
Adobe announced today that they have released technology and information to Google and Yahoo enabling them to crawl Flash files. It may take the search engines some time before they are able to integrate and implement these abilities, but a time is coming when rich media is less of a liability. I wonder if MSN/Live was left out to prevent them from reverse engineering Flash for their new Silverlight competitor? At any rate, MSN is still working on accessing text links, so let’s not swamp them.

2. Crawling forms
Googlebot recently started filling out forms on the web in an attempt to discover content hidden behind jump menus and other forms. See our previous article if you’d like to keep Google out of your forms.

3. Working with Government entities to make information more accessible
A year or so ago, Google started providing training to government agencies to assist them in getting their information onto the web. I’m assuming much of the information has been hidden behind URLs with large numbers of parameters.

4. Crawling JavaScript
Many menus and other dynamic navigation features have been created in JavaScript, and Googlebot has started crawling those as well. Instead of relying on webmasters to provide search friendly navigation, Google is finally getting access to sites created by neophyte webmasters who haven’t been paying attention.

5. Google’s patent to read text in images
Google also knows many newbie webmasters use images of text as navigation buttons. By attempting to read text in images, Googlebot will once again be able to open up previously inaccessible areas of a site.

6. Inbound links
Of course, Googlebot has always been great at following inbound links to new content. Much of the invisible web has been discovered just through humans linking to a previously unknown resource.

7. Submission
Of course, you can always submit a page location of currently invisible content to Google. This is usually the slowest way, especially compared to inbound links.

8. Google toolbar visits, analytics
Recently, many Denver SEO professionals have noticed URLs being indexed that had not been submitted. The only plausible explanation is that Google has been mining its toolbar and analytics data for information about new URLs. Be careful – Google is watching and sees all!

9. Sitemap.xml files
The somewhat new sitemap.xml protocol is very helpful for webmasters and Googlebot alike in getting formerly invisible content into Google’s hands.

5 web development techniques to prevent Google from crawling your HTML forms

Friday, April 18th, 2008

Google has recently decided to let Googlebot crawl through forms in an effort to index the “Deep Web”. There are numerous stories about wayward crawlers deleting and changing content by submitting forms, and it’s about to get worse: Googlebot will soon start submitting forms to get at your website’s deeper data. So what’s a web developer to do?

1. Use GET and POST requests correctly
Use GET requests in forms to look up information, and use POST requests to make changes. Google will only be crawling forms via GET requests, so following this “Best Practice” for forms is vital.

2. Make sure your POST forms do not respond to GET requests
It sounds so simple, but many sites are being exploited for XSS (Cross Site Scripting) vulnerabilities because they respond (and return HTML) to both GET and POST requests. Be sure to check your form input carefully on the backend, and for heaven’s sake – do not use globals!
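Tips 1 and 2 can be sketched in a few lines. The Python WSGI handler below is a hypothetical example (the handler name and responses are ours, not from any particular framework): it stands in for a form that changes data and refuses anything that is not a POST, so a crawler wandering in with a GET request gets a polite 405 instead of triggering a change.

```python
def comment_form_app(environ, start_response):
    """Hypothetical WSGI handler for a form that changes data: POST only."""
    if environ["REQUEST_METHOD"] != "POST":
        # Refuse GET (and everything else) so crawlers cannot trigger changes.
        start_response(
            "405 Method Not Allowed",
            [("Allow", "POST"), ("Content-Type", "text/plain")],
        )
        return [b"This form only accepts POST requests.\n"]

    # A real handler would validate the submitted fields and save them here.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Comment saved.\n"]


def call_app(method):
    """Call the handler directly with a given HTTP method; no server needed."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(comment_form_app({"REQUEST_METHOD": method}, start_response))
    return captured["status"], body


print(call_app("GET"))
print(call_app("POST"))
```

Whatever language your site runs on, the idea is the same: check the request method first, and only then look at the form input.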

3. Use robots.txt to keep robots OUT
A robots.txt file keeps Googlebot out of where it doesn’t belong. Luckily, Googlebot will continue its excellent support of robots.txt directives when it goes crawling through forms. Be sure not to accidentally restrict your website too much, however. Keep the directives simple, excluding by directory if possible. And test, test, test in Google’s Webmaster Tools!
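If you want a quick local check before heading to Webmaster Tools, Python's standard library can evaluate a set of directives for you. This is only a sketch: the /search/ and /forms/ directories and the example.com URLs are made up, and Python's parser may not match Googlebot's behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Simple rules that exclude by directory; the paths are examples only.
RULES = """\
User-agent: *
Disallow: /search/
Disallow: /forms/
"""


def is_allowed(rules, user_agent, url):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "http://www.example.com/search/?q=seo",
    "http://www.example.com/blog/first-post/",
):
    print(url, is_allowed(RULES, "Googlebot", url))
```

Running a handful of your most important URLs through a check like this is a cheap way to catch a directive that blocks more than you meant it to.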

4. Use robots metatag directives
Use the robots metatag directives for more refined control. We recommend “nofollow” and “noindex” directives for both the form submission page and search results pages you want Google to stay out of, even though Google says disallowing the form submission page is enough. Consider using tags and category pages that are Google friendly instead.
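The tag itself is a single line in the page's head section. As a small sketch, here is a hypothetical Python helper (the function name is ours) that a template could call to emit it for form and search results pages:

```python
def robots_meta_tag(index=False, follow=False):
    """Build a robots meta tag; the defaults give noindex, nofollow."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="%s">' % ", ".join(directives)


# Tag for a form submission page or a search results page.
print(robots_meta_tag())
```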

5. Use a CAPTCHA where possible
Googlebot isn’t going to fill out a CAPTCHA, so it’s an easy way to make sure some bot isn’t filling out your form.

Googlebot is, of course, the nicest bot you can hope to have visit your website. This provides a chance to secure forms and take necessary precautions before other – not so polite – bots visit your forms.

7 untimely ways for an SEO to die

Friday, May 11th, 2007

In ancient Rome, the ghosts of the ancestors were appeased during Lemuria on May 9. Not many people know that, and even fewer care. But in the spirit of Lemuria, we offer seven untimely ways an SEO can die (it’s a dangerous world out there, and also I’m low on blog posting ideas):

- Bitten by search engine crawlers.

- Trampled by googlebots. (This is actually the best way to go, if you have to.)

- Trip over an HTML tag someone forgot to close. (This was funnier last night when I thought of it – go figure.)

- You get (google)whacked while visiting a bad link neighborhood.

- You’re doing the googledance, slip on a banana peel and hit your head. Certainly I’m not the only one who knows the googledance? Please submit your videos if you know it: googledance@hyperdogmedia.com.

- You receive a suspicious package in the mail, and it turns out to be a googlebomb.

- You’re setting linkbait traps and get an arm caught.

Please submit any other ideas you might have via email: lemuria@hyperdogmedia.com. So strike up that pun machine, it’s Friday!

Update: Debra just suggested you could “overdose on link juice” – if only!
