When I first wrote my article called A Proper Foundation for Search Results back in 2004, I admit that I really had no idea there were so many other factors involved in rankings. That article morphed into an entire SEO 101 section in 2006, but even that isn’t complete, because things are always changing, and there’s always more to learn.
What follows here are 10 more search ranking factors for the intermediate-level SEO, and they should be easy to understand. However, please leave any questions or comments for clarification, or feel free to offer more input if you like…
11. Robots.txt –
The robots.txt file has become a standard way of giving instructions to the search engines concerning your website, and it’s also an effective way of telling them where your XML site map is. You can read a lot more about the robots file here, but the most important thing to remember is simply that you need to have one.
If you have a robots.txt file, then your site won’t be generating 404 errors every time the search engines look for the file, and fewer 404 errors are a good thing.
A typical robots.txt file will show the spiders the location of the XML site map, and then disallow any directories that the site owner doesn’t want crawled, by either all spiders or just certain user agents. For example, the following three lines may constitute an entire robots file:
Sitemap: http://www.pdxtc.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin
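To block only a certain crawler instead, you would add a separate block naming that user agent (the directory shown here is just a made-up example):
User-agent: Googlebot
Disallow: /internal-reports/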
If you’re not comfortable with the syntax, here is a free tool for generating your own robots.txt file…
12. Use an XML Sitemap –
The XML site map has become the de facto standard for ensuring that all of your pages get crawled. There are multiple free utilities and websites available for creating your XML site map, like…
http://www.xml-sitemaps.com/
http://www.auditmypc.com/free-sitemap-generator.asp
http://www.sitemapdoc.com/
but you need to keep in mind that this is NOT the same thing as number 7 in my top 10 SEO factors: a static site map.
No human will likely ever look at this file. Instead, you upload it to your website, and then “tell” the search engines that it exists via your robots.txt file and by manually submitting it through Google’s Webmaster Tools.
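For reference, a bare-bones XML site map following the sitemaps.org protocol looks something like this (the URL and date are placeholders, and the changefreq and priority tags are optional):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-06-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>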
13. Permanent 301 Redirects –
The actual file names (pagename.html) should not be changed if it can be avoided, and pages & sections should NEVER be deleted entirely from your website without redirecting spiders and humans to another page on your site.
Missing pages create 404 (page not found) errors, so it’s important to do a permanent redirect (301) to a similar or corresponding area of your website.
For example, if you sell widgets and a particular model of blue widget is no longer carried, don’t just delete the page; instead, create a 301 redirect pointing back to your blue widget category. This retains the value of any inbound links you may have obtained for that product, and prevents pages that no longer exist from showing up in the search index.
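If your site happens to run on Apache (just an assumption; IIS and other servers have their own methods), a single line in your .htaccess file will handle it, using made-up page names here:
Redirect 301 /widgets/blue-widget-model-x.html http://www.domain.com/widgets/blue/
That one line sends both visitors and the search engine spiders to the category page, and passes along the value of any links pointing at the old URL.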
14. Avoid 404 errors –
This item is more theory than fact, but nowadays (especially with Google), “trust and credibility” weigh heavily in your ranking factors. Therefore, it’s my belief that the search engines do not look favorably upon sites with continual 404 errors.
If you are constantly changing your website, and visitors or spiders are coming up with 404 errors on a regular basis, then it’s possible (and, I believe, likely) that the search engines could see that as “somewhat flaky” and lower your ranking. Again, I have no proof here, but to me it’s logical, and I’ve always insisted that 404 errors be taken care of when evaluating a website.
There are tools you can use to find broken links, which are the most common source of 404 errors, and my tool of choice is called Xenu Link Sleuth. You can also look at your own internal stats program to see what 404 errors may be getting generated, and while Google Analytics does NOT show you your 404 errors (how stupid is that?), your web host’s internal statistics most likely do.
It’s important to track down and eliminate your 404 errors by either fixing the broken link, replacing the missing file(s), or by adding a 301 redirect for the missing URL to another relevant area of your site.
15. DocType statement in .html code –
Some people may think this is silly, but I’ve seen it help before. I recommend placing a DOCTYPE statement at the very top of your HTML code. A DOCTYPE statement declares the document type and level of HTML syntax. While no human visitor will ever see it, the statement is read by web browsers, software, and even search engine spiders.
Again, this may be more theory than fact, but I can tell you that in my experience, with all other things being equal, a webpage with this statement will outrank a page without it more often than not.
If you care to learn more about the technical aspects of this DOCTYPE statement, read this, but otherwise feel free to “view source” here on my site and see what one looks like.
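As a quick reference, this is what the XHTML 1.0 Transitional version looks like (HTML 4.01 Strict and other flavors each have their own), and it belongs on the very first line, above the opening <html> tag:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">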
16. Valid .html code –
In my opinion, valid HTML code is way overrated as a search ranking factor, but it is still worth checking in order to catch the biggest errors.
If you really want to drive yourself crazy, and ensure hours and hours of work for your Web developer, then insist that all of your code be “valid” when run through a validator such as the W3C validator tool.
It has been my experience that minor errors showing up here will not affect your ranking; however, I do believe it’s still a good idea to run your site through this validator, so that you can pick up any glaring or dramatic errors.
Most web developers, when shown the list of errors that will undoubtedly appear, will be able to cherry-pick their way through them and fix a large number. Sometimes these errors are major, fixing them is easy, and doing so can improve your search rankings.
Other times the errors are minor, take hours and hours to track down, and in the end have no significant impact on your search rankings.
My recommendation is to verify that major errors get cleaned up, and simply ignore the rest, unless your goal is just to keep a web developer busy digging through the code.
17. Clean URLs –
Try to stay away from long URL strings that contain ugly characters and make no sense to humans. While the search engines have gotten far better at spidering and indexing them, people are still wary of clicking on them.
Whenever possible, try to use relevant keywords in your page names and directory names, and separate them with dashes rather than underscores.
There is some evidence that doing so can improve your ranking for individual phrases, but there’s even more evidence showing that you are far more likely to get a click-through when someone can tell that the URL is relevant to their search. Furthermore, as people share URLs via email and bookmarks, “people friendly” URLs are, again, far more likely to get the click.
To illustrate, let me ask: which of the following two URLs are you more likely to click on, if someone sends you a link or you see them in a search result?
http://www.domain.com/1743233_uld.asp/cid=7344/?item=4330
or
http://www.domain.com/perfect-shirt-red/
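For what it’s worth, if your site runs on Apache with mod_rewrite (again, just an assumption, and the file and parameter names below are made up), the clean version can quietly map back to the ugly one behind the scenes:
RewriteEngine On
RewriteRule ^perfect-shirt-([a-z]+)/?$ /product.php?slug=perfect-shirt&color=$1 [L]
Visitors and spiders only ever see the friendly URL, while your shopping cart still gets the parameters it needs.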
18. Avoid Session IDs –
While this is related to the previous item, clean URLs, it’s worth mentioning on its own. If you have a website that sells something, it’s likely that your users are assigned session IDs for tracking purposes when they visit.
A session ID is appended to the URL, creating a new, temporary set of URLs for the user’s entire visit. This can be excellent for monitoring site visitor statistics and user behavior, but it can be detrimental to your search rankings if your system is assigning session IDs to the search engine crawlers as well.
You need to ensure that session IDs are not being assigned to the search engine robots visiting your website, and to stop it, have your developer read this.
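As one example, if your site happens to be PHP-based and running under Apache with mod_php (purely an assumption about your setup; other platforms have their own equivalents), two lines in your .htaccess file will stop PHP from ever appending its session ID to URLs and force it to use cookies instead:
php_flag session.use_trans_sid off
php_flag session.use_only_cookies on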
19. Site Structure –
Your site should have a logical structure, with individual sections or categories for specific types of information and products. Doing so allows users to easily determine where they are at any given time, allows the search engine spiders to see a well-organized site, and also allows you to develop your site in a “theme based” manner when it comes to adding content and developing inbound links.
A quick example: if your website sells cars, you would have your top-level pages, including home, about us, contact, new cars, and used cars.
Underneath your new cars and used cars categories, you can break things down into brands, years, and models, and as you get to specific cars, you would have photos, spec sheets, etc. for each given car.
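In terms of URLs, that car-dealer example might shake out something like this (the brands and models are hypothetical, of course):
http://www.domain.com/new-cars/
http://www.domain.com/new-cars/honda/
http://www.domain.com/new-cars/honda/2008-accord/
http://www.domain.com/used-cars/
http://www.domain.com/used-cars/toyota/
http://www.domain.com/used-cars/toyota/2005-camry/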
Different search marketers have different names for this type of structure, but setting things up this way creates a nice “Pyramid” or “Theme” or “Silo” structure for your website that lends itself to excellent search rankings.
Using a graphic tool to chart your website’s structure is an excellent way to visualize what it looks like, and for that I use Mindomo.
20. Page file size –
Just like your actual web visitors, the search engines like pages that load quickly and don’t make them wait for files and graphics to download. Keep your entire page size as small as possible; I prefer to see most pages under 100K total. The faster they load, the better.
Web developers and graphic artists frequently want to impress visitors and clients with their graphics or Flash files, but great care should be taken to strike a balance between how a page looks and how long it takes to load.
Remember that pages over 150K are often not even fully cached by the search engines, and Google considers page load time so important that it has now become a factor in the Google AdWords Quality Score.
They know that slow-loading pages produce a negative user experience, so they actually make you pay more for them, and it’s logical that this would extend to the organic SERPs as they try to improve the user experience.
To be continued –
This certainly isn’t everything, but it’s a good start, so this article has been continued with 10 more advanced SEO tips.
Thanks for the great post, Scott. It’s interesting that so many of your intermediate SEO factors fall under the purview of the hosting techs. We constantly struggle with making them aware of how important what they do down in that dark basement is to SEO.
Thanks Mary, and yep, I’m sure you agree that a Web host can be very important to the success or failure of any search campaign…
I agree about the XML sitemap, but have you noticed that there’s some gathering debate about whether using xml-sitemaps.com can actually hurt your SEO? I am not sold, but it’s definitely gathering steam…
Thanks for commenting, Paul. No, actually, I’d not heard anything about that theory beyond the similar tired rantings against using Google Analytics. I can’t imagine there’s any validity to it, but I would love to read about it… Got a couple of links?
Scott, great post. The tip on file size is great. Great jump from AdWords to organic SERPs. How effective is “Siloing”? I have heard some SEOs say that it is the most effective SEO strategy out there and can allow a site with 100 inbound links to compete with sites with 10,000 inbound links. Before I saw this article, I had already read one of Bruce Clay’s articles on the subject this morning, which I found very interesting.
Thanks Jason –
Well, you can call it what you want, but I think “Siloing”, or “theming” is just good website structure, and I think it’s hugely effective.
Of course, it’s better to design a site like that in the first place, because going back in and restructuring an entirely linear site can be tedious, involving lots of 301 redirects, but in the end I do think it’s worth it.
(yes Bruce Clay has a great series, I think it’s 4 parts, on “siloing”)
Scott,
When theming with WordPress, do you use posts or pages? Each page is going to link to the top of all the categories or pages. Does this negatively affect theming?
I’m not talking about posts, just about pages and categories. Yes, most themes in WordPress will just link to everything unless modified.
While that won’t have a “negative” effect, you can do more by creating multiple versions of certain files (like the footer, for example) to do some PageRank sculpting using the nofollow tag.
Great tips, Scott. For page file size, people can even reduce sizes by using HTML code cleaners, for further benefits like getting more visitors.