All questions with the tag "Crawling":

Question Views
1. Should I disallow Googlebot from crawling slower pages? 1,670
2. How effective is Google now at handling content supplied via Ajax? 1,632
3. Uncrawled URLs in search results 1,433
4. Links from relevant and important sites have always been a great way to get traffic and acceptance for a website. How do you rate links to a website from new platforms like Twitter and Facebook? 1,361
5. Is first link priority an on-page SEO factor? 1,347
6. Will Google consider Yahoo! Directory and BOTW as sources of paid links? 1,311
7. How can I make sure that Google reaches and indexes pages that are on a lower (deeper) level of a website? 1,285
8. As Google's algo evolves, is it better to have exceptional links and mediocre content, or exceptional content and mediocre links? 1,257
9. HTML sitemap vs. XML sitemap. Which one is yummy for Google search engine spider? 1,236
10. Regarding "nofollow" on internal links: Does it hurt? 1,197
11. In regards to the new canonicalization tag, does it make sense for large corporations to consider placing that tag on every page due to marketing tracking codes and large levels of duplicate URLs like faceted pages and load balancing servers? 1,195
12. Is a website designed with a CSS-based layout more SEO friendly than a table-based layout? 1,195
13. We are a pretty big site. We are changing our hosting company in the next few weeks (same country). Should we be scared from an SEO perspective? 1,191
14. How does Google determine domain age, and is it important for ranking? 1,169
15. The Sitemap.xml file states there are 10,000 URLs but only 1,500 have been indexed. After numerous crawls it does not appear Google is going to index these additional detail pages. What can I do to get Google to index my unique and current detail pages? 1,166
16. PHP performance tips 1,163
17. What are your views on 'PageRank sculpting'? 1,152
18. Is there a limit to the number of pages that Google will index from one site? 1,152
19. Why does Google crawl/index blogs (specifically sites notified by "WordPress XMLRPC pings") so much faster than a "normal" site submitting a revised Sitemap. What is the impact of that on the overall "quality" of the index? 1,151
20. How much time is Google taking to index a new webpage, and how can we accelerate the process besides using Google Webmaster Tools? 1,148
21. What impact do site load times have on Google rankings? 1,145
22. Should a "Sale Page" be in a robots.txt file to avoid duplicate content? 1,145
23. Is it a good thing to put 'nofollow' in links to a disclaimer, privacy statement and other pages like that with the internal PageRank in mind? 1,141
24. Can Google provide a way to mark a section of our pages as being less important for being indexed/snippeted by Google? 1,140
25. An orphanage website I work on is showing up for searches on "girls in bathrooms" because they have an article about renovating the girls' bathroom! What do you think of the idea of a negative keyword meta tag to block irrelevant searches? 1,134
26. How does Google rank sites which run on a different port than the standard port 80? 1,132
27. Are there any APIs available from Google to pull out reports from Google Webmaster Tools? 1,119
28. If Google crawls 1,000 pages/day, Googlebot crawling many dupe content pages may slow down indexing of a large site. In that scenario, do you recommend blocking dupes using robots.txt or is using Meta Robots noindex,nofollow a better alternative? 1,108
29. How actively is WMT (Webmaster Tools) monitored by Google, specifically when there is a system error (such as the recent error in reporting the number of pages indexed)? 1,107
30. Now that Google can crawl JavaScript links, what is going to happen with all those paid links that were behind JavaScript code? 1,103
31. Minimizing browser flow 1,102
32. How can Googlebot crawl and index pages that don't have any links to them on my website? 1,097
33. Does Google have any suggestions (or data) on the impact of pipes versus dashes in the title tag? 1,094
34. How would Google consider (and rank) a site that uses meta data and URLs in a language (Italian) and has the H1 of the pages in another (English) considered more appealing for users? 1,093
35. Specifying an image's license using RDFa 1,090
36. How reliable is the '' query in determining the number of pages in the Google index? 1,088
37. Which search media returns more reliable information: Google or Twitter? 1,083
38. How does Google handle ligatures, soft-hyphens, interpuncts and hyphenation points? 1,078
39. Will DiggBar create duplicate content issues? 1,078
40. We still have old content in the index. We block them via robots.txt, use 404 and delete via Webmaster Tools, but Google still keeps it. What can we do to quickly delete content from the index? 1,077
41. Optimizing the order of scripts and styles 1,074
42. AdWords keyword tool gives an estimate of search traffic for a specific (or broad) keyword – How much (%) of this traffic do you believe are search marketers, SEOs, analysts and even business owners etc searching their own targeted keywords? 1,073
43. Say your index page has been cached by Google and then you change the meta description. How long does it take for a Google bot to recrawl that page? 1,073
44. Do dates in the URL of blogs or websites help determine freshness of the content or is it largely ignored? 1,073
45. Google announced page load speed matters for ranking. Should we be doing content-only pages for Google bots? 1,054
46. How not to hide text 1,050
47. How much does the size of a web site (# indexed pages/content) have an effect on its authority in Google's eyes? 1,047
48. If we were to syndicate our written content (entire articles) to multiple domains, would we be able to use the imminent cross-domain <link rel="canonical"> tag to confirm which site we would like to have indexed for a given piece of content? 1,037
49. Is it possible to exclude Experts Exchange from search results? 1,036
50. Does PageRank take into account cross-browser compatibility? 1,030
51. How will Google search work with dynamic HTML pages (and I don't mean JSP or other Web 1.0 technologies), like applications that are built with GWT? 1,028
52. Last year, one of my client's web servers went down for over a day. Would this have affected the site's PageRank at all? 1,026
53. Does Google crawl and treat TinyURLs using a 301 redirect the same as other links? 1,025
54. Websites lose backlinks due to other websites going out of business or closing (Geocities, AOL member pages). Does Google remove the back link juice that once came from these pages? 1,025
55. What are Google's plans for indexing the deep web? 1,019
56. A question about unintended duplicate content: If an online shop can be reached through several TLDs (like .de, .at, .ch) and the only difference is the currency (and necessarily the checkout process), does Google consider this duplicate content? 1,013
57. What is the best way to serve different content according to user country IP (legal reasons)? 1,013
58. I am using a template website (I'm an amateur!). The H1 tag appears below the H2 tag in the code. Does the spider know what's going on? 1,010
59. If a page is disallowed by robots.txt, will a link to this page transfer/leak link juice? 1,005
60. Is there a way to tell Google bots to exclude recurring words on a website such as "leave comment" or "print page" when indexing in order to improve the keyword density? 1,001
61. I noticed that, for example, "Texas widget", and "widget Texas" return different results. I think the gist is the same but the results were different. I'd like to include both terms/phrases on my page but wouldn't that be considered keyword spamming? 1,000
62. What is the benefit of using the Change of Address tool in Google Webmaster Tools, compared to just setting up the required 301 redirections to the new site? 989
63. If I externalize all CSS style definitions and JavaScript scripts and disallow all user agents from accessing these external files (via robots.txt), would this cause problems for Googlebot? 989
64. Following your interview with Eric Enge – you mentioned "If-Modified-Since." We have worked on many websites where the actual file timestamp doesn't change but the content does, as the pages are database-driven. How should we deal with such situations? 985
65. Are you ever going to do 'weather reports' for algorithm updates like Yahoo! does? 981
66. I have a server-side script that automatically redirects visitors to a mobile version of a site if they are using a mobile browser. My question is: What are some things to watch out for (if any) when serving different content based on the visitor? 971
67. Does using a class or an id in a header tag: <h1 id="whatever">text</h1> instead of plain headers: <h1>text</h1> interfere with the way search engines see and understand headings? 959
68. Can moving my website to 'the cloud' harm my listings? 958
69. Can I use robots.txt to optimize Googlebot's crawl? 957
70. Can we feed Googlebot a version of a page that does not contain any advertising code (JavaScript or otherwise)? 946
71. How does Google calculate site load times in the data it exposes in Google's webmaster statistics? 944
72. What is the best way to deal with BIG sitemaps.xml (e.g. more than 1,000,000 pages)? 944
73. Are Chrome's 'usage statistics' used in evaluating site speed? 944
74. Will the new canonical tag help with issues where you, by accident (stupid editors linking to wrong addresses) have indexed sites by the IP address rather than hostname? 932
75. "Real-time indexation" on Google, when we use; is this a possibility in the near future? 931
76. Does Googlebot use inference when spidering – having crawled and /page2.htm, can it guess at the existence of a /page3.htm and crawl it? 925
77. What is the nofollow equivalent for JavaScript links/redirections (now that you follow those too)? 901
78. How many bots/spiders does Google currently have crawling the web? 898
79. Any reason why Google search does not treat the @ symbol differently given the rise of Twitter? 898
80. Is there a good way to kick off a feed in Google Reader by doing something like temporarily making the feed include a whole bunch of old content? 890
81. We work on a well-established website. Mobile web seems to be becoming more and more popular – should we create a mobile version of this site? 885
82. On a web retail site, unique item descriptions are ideal for both users and Googlebot, compared to generic manufacturer descriptions. Some users prefer to see generic descriptions, too. Will including both reduce significance of the unique content? 862
83. I hate IE6! How would you propose we rid the internet of this outdated browser? 857

