Don’t let Google find your secrets

Google is one of the best hacking tools out there. It may sound incredible, but relatively simple searches in Google and other search engines can dig out sensitive or even dangerous information about your site, your servers and your company.

You want Google to index your site and make you visible and searchable. That part is good for you. But if you have been careless, Google can also index more sensitive information that was never meant to be public, and can therefore be a useful tool for hackers if they want to probe your site for vulnerabilities. This is often called Google hacking. (To be fair, other search engines can be used as well, so it could just as well be called search engine hacking.)

There is a lot of information that can be found on Google thanks to careless (or clueless) administration of websites:

* Various usernames and passwords (both encrypted and in plain text)
* Internal documents
* Internal site statistics
* Intranet access
* Database access
* Mail server access
* And much, much more

Needless to say, a lot of this information can be used as a starting point for breaking into your systems.
Some examples

There are a huge number of different search strings for sensitive information that have been published around the web. A collection exists at the Google Hacking Database (it currently has 1,423 entries), which is where we found the two following examples.

The image below shows the result of two different searches, one for SQL insert statements that use encryption functions for passwords, and another one for INC files with PHP code in them that contain unencrypted user names, passwords and addresses to the corresponding databases.

Both these searches show a number of results where passwords and usernames have been indexed and cached by Google.

[Image: results of Google hacking]

Not all of the results will be relevant, but this is just the tip of the iceberg.
How common is it?

We searched for “google hack” in Google Trends, and as you can see in the graph, that search term is becoming increasingly popular.

[Image: Google Trends graph for “google hack”]

This isn’t all that scientific, but it could give a hint that this kind of information gathering is increasing.

On a small side note, it is also interesting to see that a lot of the searches seem to be coming from Asia, and especially South-East Asia. The top countries for this search term are Indonesia, Vietnam, Malaysia and the Philippines.
Is Google doing anything to prevent this?

Google is not standing idly by. The company seems to proactively try to block some usage of these “hacker searches”, at least via direct search links (which can easily be used by automated applications). Some of the queries we tried gave us this result:

[Image: Google 403 Forbidden response]
What can you do to prevent “Google hacks” on your site?

There is no way we can list every single security measure out there, but we hope you will find this to be a useful starting point.

* First, if at all possible, keep all your sensitive information off the internet, i.e. on storage that isn’t even connected to it.
* Be careful how you write your scripts and access your databases. There are numerous examples where a database access error text shows up on a website and contains way too much information. If you are unlucky and Google crawls your site at that time, the information is public (and cached by Google).
* Use robots.txt to let Google know which parts of your website it is OK to index. Note, however, that even this information can be used by hackers: if you specify which parts of the website are “off limits”, anyone targeting your site specifically will of course look there to see what is so sensitive. And don’t forget that someone who scans your website themselves, without the help of Google, won’t care about robots.txt or other measures that only restrain well-behaved search robots like Googlebot.
* Make sure that the directory rights on your web server are in order, i.e. only allow public access to the bare minimum of directories that are necessary for your site to function. This precaution isn’t really specific to Google hacking, but it is worth mentioning.
* Monitor your site for common errors. You can set up monitoring (for example with Pingdom, or another website monitoring service) that checks for text that should not exist on the page, for example part of a PHP script error message. Then you will know right away if and when your site has “messed up”, and can take the necessary precautions (changing passwords or whatever else may be suitable).
* “Google hack” your own website. Try out the various searches listed in the Google Hacking Database on your own site.
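
The monitoring idea in the list above can be sketched in a few lines of Python. This is only an illustration: the signature list and the function name `find_leaks` are assumptions made for the example, not part of any monitoring product, and a real check would fetch your own pages and use signatures that match your own stack.

```python
import re
from typing import List

# Hypothetical signatures of leaked error output; extend these for your own stack.
ERROR_SIGNATURES = [
    r"Warning: \w+\(\)",                       # PHP-style warning naming a function
    r"Fatal error:",                           # PHP-style fatal error prefix
    r"You have an error in your SQL syntax",   # typical MySQL error text
    r"Access denied for user",                 # failed database login message
]

def find_leaks(page_text: str) -> List[str]:
    """Return every signature that matches somewhere in the page text."""
    return [sig for sig in ERROR_SIGNATURES if re.search(sig, page_text)]

clean = "<html><body>Welcome to our shop</body></html>"
broken = ("<b>Warning: mysql_connect() [function.mysql-connect]: "
          "Access denied for user 'root'@'localhost'</b>")

print(find_leaks(clean))    # nothing suspicious on a healthy page
print(find_leaks(broken))   # two signatures match the leaked error text
```

Running a check like this against the HTML your site actually serves gives an early warning before a crawler indexes the error text.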

Find out more

* Old but good article explaining search engine hacking.
* The Google Hacking Database has more Google search queries for sensitive information than you could ever imagine.
* The Google Hack Honeypot, which tries to find new searches that are being used.
* More about robots.txt. The always-useful Wikipedia has a good entry about robots.txt as well.

BACK TO BASICS: Researching with Search Engine Operators

by: Eileen Kowalski, October 2004

Researching a search engine algorithm is kind of like atomic research. The search engine algorithm is a black box that you cannot see inside of, so you can only formulate theories about a specific algorithm by testing its behavior. This is why it is so important to know all of the different search engine operators and how they work. In this article, we will investigate search engine operators and how you can use them to enhance your search engine optimization efforts.
Boolean Search Operators

Boolean operators are really useful for investigating how your competition uses keywords on the page. The basic Boolean search operators are AND, OR and NOT:

* AND or + yields results that contain both of the keywords on the page. Since all of the major engines automatically search for all of the words you enter into the search field, the AND operator is usually unnecessary. If you want to find pages that contain keywords as a phrase, you should put quotation marks around your keywords ("keyword1 keyword2").
* OR yields results that contain at least one of the keywords on the page. The OR operator is good to use when you are searching for more than one keyword and searching without an operator yields no results.
* NOT or - yields results that contain one keyword but exclude the other keyword. This operator is especially useful to see which competitors are not using specific keyword combinations, which you can then target.

Many search engines have included AND and NOT on their lists of stop words, or words that are excluded from your search. We recommend using +, -, OR and quotation marks to investigate specific keyword combinations in Google, Yahoo, MSN and Gigablast.
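
As a small illustration of combining +, -, OR and quotation marks, the sketch below assembles a query string from its parts. The function name `build_query` and its parameters are invented for this example; it only builds text and does not contact any search engine.

```python
def build_query(required=(), any_of=(), excluded=(), phrase=None):
    """Assemble a search string from +, -, OR and a quoted phrase."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')               # exact phrase in quotation marks
    parts.extend(f"+{word}" for word in required)  # words that must appear
    if any_of:
        parts.append(" OR ".join(any_of))          # at least one of these words
    parts.extend(f"-{word}" for word in excluded)  # words that must not appear
    return " ".join(parts)

print(build_query(required=["seo"], excluded=["spam"], phrase="search engine"))
print(build_query(any_of=["hats", "beanies"]))
```

Keeping the pieces separate makes it easy to test one keyword combination after another against each engine.
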
Advanced Search Engine Operators

However, advanced search operators are what really allow you to dig deeper into specific aspects of each search engine’s algorithm. For example, if you are focusing on your link campaign, competitive research using the link: operator is essential. At SEOToolSet, we have aggregated all of the data that we could find on search engine operators and checked every operator to make sure it still works. We have also included some ideas on how to use these operators in your SEO efforts. Remember that these are just suggestions and are not the official use of any of the operators below. The table of search engine operators listed below is current as of October 2004. We will do our best to keep it updated, especially as a whole new range of operators comes out with MSN (hopefully). If you know of any operators that should be on this list, please e-mail them to ekowalski@bruceclay.com and she will add them to this page.
Operator (Google / Yahoo / MSN / Gigablast) | Search Terms | Search Results | SEO Applications
site: site: or domain: or hostname: site: site: | url | All pages within a particular domain and all its subdomains if you search without the www. | To see how many webpages the search engine has indexed.
cache: | url | The cached version of a page. | To see the search engine’s cache of a particular page.
info: or id: | url | Links to the cached webpage, inbound links to that page, related webpages and pages that contain the webpage URL. | To see several types of information that Google has about a page.
url: url: | url | The listing for that specific URL in the index. (Note: You must type in http://.) | To see if a particular URL is indexed.
ip: | 123.12.123 keyword | Pages that contain your keyword inside an IP range. | To see if duplicate content sites or spam sites exist in the same IP range as your client’s site.
related: | url | Pages that are "similar" to a specified URL. | To see a small indication of what Google considers to be "related" content.
link: link: link: link: | url | Pages that link to a particular URL. (Note: In Yahoo, you must use http://.) | To see the number and quality of inbound links to your client’s site and their competitors’ sites.
-link: | | Pages that contain your keyword and do not link to the specified URL. | To identify relevant sites that do not link to your competitor’s site.
linkdomain: | url | Pages that link to a particular domain. (Note: Do not type in http://.) | To see the number and quality of inbound links to your whole website in Yahoo.
filetype: or ext: originurlextension: type: | filetype keyword | Pages of a specific filetype that contain your keyword. | To analyze your competition’s use of PDFs, Flash files, Word documents or Excel files.
define: define: | keyword | Pages that contain your keyword inside definition tags. | To analyze your competition’s use of definition lists.
allintitle: | keyword keyword | Pages that contain all your keywords in their title tag. | To analyze your competition’s title tags.
intitle: intitle: or title: title: | keyword | Pages that contain your keyword in their title tag. | To analyze your competition’s title tags.
allinurl: | keyword keyword | Pages that contain all your keywords in their URL. | To analyze your competition’s URLs.
inurl: inurl: suburl: | keyword | Pages that contain your keyword in their URL. | To analyze your competition’s URLs.
originurlpath: | keyword | Pages that have the keyword inside their directory names. | To analyze your competition’s directory names.
allinanchor: | keyword keyword | Pages that have inbound links that contain all your keywords. | To analyze your competition’s inbound anchor text.
inanchor: | keyword | Pages that have inbound links that contain your keyword. | To analyze your competition’s inbound anchor text.
allintext: | keyword keyword | Pages that have body text that contains all your keywords. | To analyze your competition’s body text.
intext: | keyword | Pages that have text that contains your keyword. | To analyze your competition’s body text.
stem | keyword | Pages that contain the keyword and the keyword with different endings. | To analyze your competition’s use of stemming.
~ | keyword | Pages that contain the keyword and synonyms for the keyword. | To analyze your competition’s use of synonyms.

Plus, Yahoo has additional feature: and region: operators that let you search for a keyword within a website with certain features or inside a specific region.
Yahoo Search Terms | Search Results
feature:acrobat keyword | Pages that contain your keyword and links to Adobe Acrobat files.
feature:applet keyword | Pages that contain your keyword and embedded Java applets.
feature:activex keyword | Pages that contain your keyword and ActiveX controls or layouts.
feature:audio keyword | Pages that contain your keyword and links to audio files.
feature:flash keyword | Pages that contain your keyword and Flash files or links to Flash files.
feature:form keyword | Pages that contain your keyword and use forms.
feature:frame keyword | Pages that contain your keyword and use frames.
feature:homepage keyword | Pages that contain your keyword and are seen as personal pages because they use a tilde ~ in their directory structure.
feature:image keyword | Pages that contain your keyword and gif, jpg and other image files.
feature:javascript keyword | Pages that contain your keyword and JavaScript.
feature:index keyword | Home pages that contain your keyword.
feature:meta keyword | Pages that contain your keyword and meta tags. (Note: This does not show if sites have the keyword in their meta tags.)
feature:script keyword | Pages that contain your keyword and embedded scripts.
feature:shockwave keyword | Pages that contain your keyword and links to or embedded Shockwave files.
feature:table keyword | Pages that contain your keyword and tables.
feature:video keyword | Pages that contain your keyword and links to or embedded video files.
feature:vrml keyword | Pages that contain your keyword and link to VRML files.

Yahoo Search Terms | Search Results
region:africa keyword | Pages that contain your keyword with African country extensions.
region:asia keyword | Pages that contain your keyword with Asian country extensions.
region:centralamerica keyword | Pages that contain your keyword with Central American country extensions.
region:downunder keyword | Pages that contain your keyword with Australian, New Zealand and other country extensions.
region:europe keyword | Pages that contain your keyword with European country extensions.
region:mediterranean keyword | Pages that contain your keyword with Mediterranean country extensions.
region:mideast keyword | Pages that contain your keyword with Mideastern country extensions.
region:northamerica keyword | Pages that contain your keyword with North American country extensions.
region:southamerica keyword | Pages that contain your keyword with South American country extensions.
region:southeastasia keyword | Pages that contain your keyword with Southeast Asian country extensions.

Official Search Engine Operator Pages

Most search engines list advanced operators on their websites, just not to the extent that we have listed them above. If you have never worked with advanced operators before, you should definitely check out:

* Yahoo’s search operators
* Google’s advanced search operators
* and Gigablast’s advanced search options

for examples and searching tips directly from the search engines.

The ultimate guide to advanced searching within Yahoo, Google and MSN

Search Engines. You got to love ‘em! The time they save us from having to search through various books, magazines, newspapers, media guides, etc. They have blessed us with more time to be lethargic and lazy in front of our flat screen computer monitors, but that’s another post all in itself.

Have you ever taken the time to think about a Search Engine’s Query? Is there an easy way to monitor links to your site through these queries? How advanced can “searching” really get? Within this post I will show how Google, Yahoo and MSN have created “shortcuts” for their Search Engines.

Let’s first take a look at Google and its advanced search engine query commands;

Allintext: If you begin your query with allintext:, Google confines the search results to pages that include all the query terms you have specified in the text of the page. For example, the query [ allintext: sports entertainment lounge ] would pull only pages in which the words “sports,” “entertainment,” and “lounge” appear in the text of the page.

Allinanchor: If you begin your query with allinanchor:, Google confines the search results to pages for which all the query terms you specify appear in the anchor text on links to the page. For example, the query [ allinanchor: historic restaurants Italy ] would pull only pages in which the anchor text on links to the pages contains the words “historic,” “restaurants,” and “Italy.” Anchor text is the text on a page that is linked to another web page or a different place on the current page.

Cache: The query [ cache: ] followed by a URL will show the version of the web page that Google has in its cache. If you add other words to your query, Google will highlight those words within the cached document. For instance, [ cache:www.google.com web ] will show the cached content with the word “web” highlighted. This functionality is also accessible by clicking on the “Cached” link on Google’s main results page.

Link: This will list webpages that have links to the specified webpage (back links). For instance, [ link:www.google.com ] will list webpages that have links pointing to the Google homepage. Note that there can be no space between “link:” and the web page URL. (also an advanced search operator within MSN and Yahoo!)

Site: This will restrict your search results to the site or domain you specify. For example, if you enter [ peace site:gov ], you will find pages about peace within the .gov domain. You can specify a domain with or without a period, e.g., either as .gov or gov. (also an advanced search operator within MSN and Yahoo!)

Allintitle: Google will bring up all results containing all the query terms you specify in the title. For example, [ allintitle: sports trivia ] this will pull up only documents that contain the words “sports” and “trivia” in the title.

Allinurl: Will pull up all specified terms within the URL. For example, [allinurl:google faq] will return only documents that contain the words “google” and “faq” in the URL, such as www.google.com/help/faq.html.

Author: If you begin your query with author:, Google will restrict your Google Groups results to newsgroup articles by the author you specify. The author can be a full or partial name or email address. Here is an example: [ Pet Cemetery author:Steven King ] will return articles that contain the words “Pet Cemetery” written by Steven King.

Define: If you begin your query search with define, this will show definitions from pages on the web for the term that you specify. An example, [ define: football ] this will pull definitions for “football.” (also an advanced search operator within Yahoo!)

Filetype: When you add filetype: to the query in the search box, this will bring up result pages whose names end in the suffix you have specified. For example, [ web page evaluation checklist filetype:pdf ] will return Adobe Acrobat pdf files that match the terms “web,” “page,” “evaluation,” and “checklist.” (also an advanced search operator within MSN and Yahoo!)

Group: By typing group operator, Google will restrict your Google Groups results to newsgroup articles from certain groups or sub areas. Example, [ dream group:misc.adults.moderated ] this will return articles in the group misc.adults.moderated that contain the word “dream” and [dream group:misc.adults] will return articles in the sub area misc.adults that contain the word “dream.”

Info: If you enter info: specific URL this will present you with some information about the corresponding web page. Example, [ info:gothotel.com ], this will show information about the national directory GotHotel.com home page.

Insubject: By entering insubject with the search query, Google will restrict articles in Google Groups to those that contain the terms you specify in the subject. For example, [ insubject:”can’t sleep” ] this will return any Google Group articles that contain the phrase “can’t sleep” in the subject.

Intext: This will pull results with documents containing your specific term in the text. For instance, [ intext:phenomenon ] this will return documents that mention the word “phenomenon” in the text.

Putting intext: in front of every word in your query is equivalent to putting allintext: at the front of your query. Example, [ intext:modern intext:artists ] is the same as [ allintext:modern artists ].
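
The equivalence between allintext: and repeated intext: can be made concrete with a short Python sketch. The helper name `expand_allintext` is made up for this illustration; it simply rewrites the query text and assumes the query starts with allintext:.

```python
def expand_allintext(query: str) -> str:
    """Rewrite 'allintext: a b' as the equivalent 'intext:a intext:b'."""
    prefix = "allintext:"
    if not query.lower().startswith(prefix):
        raise ValueError("query must start with allintext:")
    terms = query[len(prefix):].split()   # every word after the operator
    return " ".join(f"intext:{term}" for term in terms)

print(expand_allintext("allintext:modern artists"))
```

The expanded form is handy when only some of the words should be restricted to the page text, since the allintext: form applies to every word in the query.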

Intitle: By doing this, you will be pulling documents containing your specific term in the title. Example, [ spider bite intitle:symptoms ], this will bring up documents that mention the word “symptoms” in their titles, and mention “spider” and “bite” anywhere in the document (title or not). (also an advanced search operator within MSN and Yahoo!)

Inurl: Including inurl: within the search box will pull documents containing your specified term within the URL. Example, [ inurl:print site:www.googleguide.com ] searches for pages on Google Guide in which the URL contains the word “print.” It finds pdf files that are in the directory or folder named “print” on the Google Guide website. (also an advanced search operator within MSN and Yahoo!)

Location: By placing location: within your search query, only articles from the location you specify will be returned. Example, [ magazine location:Los Angeles ], this will bring up articles that match the term “magazine” from sites in Los Angeles. (also an advanced search operator within MSN)

Movie: Entering in movie: will pull movie related information. This function is more of a random search operator, but still can be useful in its own way.

Phonebook: Entering in phonebook will grab all U.S. white page listings for your selected query term. Example, [ phonebook:Starbucks Riverside ] this will pull all phonebook listings of “Starbucks” in “Riverside”.

Rphonebook: This will pull U.S. residential white page listings for the selected keyword(s) you have specified. Example, [ rphonebook:Rachael Smith Los Angeles ] this will pull the phonebook listings for Rachael Smith in Los Angeles (city or state). Abbreviations like [ rphonebook:Rachael Smith LA ] also work.

Related: Placing related:url, within the search query will list web pages that are similar to the web pages you specified. Example, [ related:www.basketballnews.com ] will list web pages that are similar to the Basketball News homepage.

Source: Placing source: within your search, this will pull articles from the news source with the ID you specify. For example, [ darfur source: Los Angeles Times ], this will return articles with the word “darfur” that appear in the Los Angeles Times. To find a news source ID, enter a query that includes a term and the names of the publication you’re seeking.

Stocks: If you place stocks: within the search box, Google will interpret the rest of the query terms as NYSE, NASDAQ, AMEX, or mutual fund stock ticker symbols, and will open a page showing stock information for the symbols you specify. Example, [ stocks: ebay.o ] this will give you information about eBay Inc.

Store: Typing store: within your search box, Froogle will pull information of the store ID you specify. Example, [dress shirts store: Nordstrom’s] will return listings that match the terms “dress” and “shirts” from the store Nordstrom’s.

Weather: Entering weather followed by a city or location name will, if the location is recognized, place the forecast at the top of the results page. Your results will usually also include links to sites with the weather conditions and forecast for that location. There is no need to include a colon after the word. Example, [ weather Whittier CA ] will return the weather for Whittier, California, and [ weather + zip code ] will pull the weather for that specific zip code.

Let’s take a look at Yahoo’s search operators, the ones that Google might not have;

Hostname: By adding hostname: within your Yahoo search query, this will allow you to find all documents from a particular host only. Example, [ hostname: computers.yahoo.com ]

Domain: Placing domain: within your Yahoo search query will pull all pages within a particular domain and all its sub domains if you search without www. Example, [ domain: computers.yahoo.com ]

Originurlextension: By placing originurlextension: within your Yahoo search query, this will pull all pages from a specific filetype containing your specific keyword.

Originurlpath: By placing originurlpath: within your Yahoo search query, this will pull all pages that have the keyword inside their directory names. This is mainly used to analyze the directory names of your competitors.

Stem: By adding stem inside of your Yahoo search query, this will pull all pages containing your specific keyword or keywords with different endings. This is used to analyze your competition’s use of stemming.

Linkdomain: By adding linkdomain: inside of your Yahoo search query, this will pull all pages that link to a particular domain. (also an advanced search operator within MSN)

Yahoo has some additional features that you might like as well;

All of these words: This action will include all of the words that you specify within your search. Comparable to placing “AND” between your specified words, “+” in front of your specified word. Example, if you are looking for “Brea Sports Bars.” This will pull all pages with “all of these words” within them.

At least one of these words: This will include all matches with either one or more of your specific words within their pages. This is just like placing “OR” between your specified words. Example, if you need to read up on either “hats or beanies.”

The exact phrase: By placing this with your search query, this will pull all pages that have the exact phrase that you have placed within your search box. This is just like placing quotes (“ “) around a group of specific words. Example, if you are looking for a specific phrase to a movie: “This is Sparta!”

None of these words: This will exclude from your search any pages that contain your specified words. This is just like placing “NOT” between your specific words or “-” before your specified words. Example, you enter baseball in the “all of these words” query, but baseball bats in the “none of these words” query.

Feature:acrobat + your specific keyword, this will pull all pages that contain your keyword and links to Adobe Acrobat files.

Feature:applet + your specific keyword this will pull all pages that contain your keyword and embedded Java applets.

Feature:activex + your specific keyword this will pull all pages that contain your keyword and ActiveX controls or layouts.

Feature:audio + your specific keyword this will pull all pages that contain your keyword and links to audio files.

Feature:flash + your specific keyword this will pull all pages that contain your keyword and Flash files or links to Flash files.

Feature:form + your specific keyword this will pull all pages that contain your keyword and use forms.

Feature:frame + your specific keyword this will pull all pages that contain your keyword and use frames.

Feature:homepage + your specific keyword this will pull all pages that contain your keyword and are seen as personal pages because they use a tilde (~) in their directory structure.

Feature:image + your specific keyword this will pull all pages that contain your keyword and gif, jpg and other image files.

Feature:javascript + your specific keyword this will pull all pages that contain your keyword and Javascript.

Feature:index + your specific keyword this will pull all home pages that contain your keyword.

Feature:meta + your specific keyword this will pull all pages that contain your keyword and meta tags.

Feature:script + your specific keyword this will pull all pages that contain your keyword and embedded scripts.

Feature:shockwave + your specific keyword this will pull all pages that contain your keyword and links to or has embedded shockwave files.

Feature:table + your specific keyword this will pull all pages that contain your keyword and tables.

Feature:video + your specific keyword this will pull all pages that contain your keyword and links to or embedded video files.

Feature:vrml + your specific keyword this will pull all pages that contain your keyword and links to VRML files.

There is also a “Shortcuts” section within Yahoo!, so if you are really interested in learning about all of your advanced searching options, this will help you in your quest for additional information.

Also, some people ask, “What is the difference between Yahoo’s Link: and LinkDomain: search commands?” Link: is a command that finds inbound links to a specific URL. LinkDomain: is a command that finds inbound links to a domain. An entire website would be expected to have more inbound links than a single webpage, so the LinkDomain: command will usually return more results than the Link: command.

See Google’s Advanced Search Operators above for additional search commands that are the same within Yahoo!

And finally, here are some Live Search advanced search operators;

Contains: By adding contains: within your search query, this will pull all pages that have links to the file type that you specify. Example, [ music contains: wma ] this will pull pages containing links to WMA.

IP: By adding IP: within your search query, this will find all sites that are hosted by a specific IP address. The IP address must be a dotted quad address. Example, [ IP: 123.45.678.901 ]
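
A dotted quad has four decimal parts, each between 0 and 255, so a placeholder such as 123.45.678.901 would not pass as a real address. A minimal Python sketch of that check, using the standard `ipaddress` module (the function name `is_dotted_quad` is just for this example):

```python
import ipaddress

def is_dotted_quad(text: str) -> bool:
    """Return True if text is a valid IPv4 dotted quad address."""
    try:
        ipaddress.IPv4Address(text)   # raises a ValueError subclass if invalid
    except ValueError:
        return False
    return True

print(is_dotted_quad("192.0.2.1"))        # address from a documentation range
print(is_dotted_quad("123.45.678.901"))   # parts above 255 are rejected
```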

Language: This will pull all pages with specified languages. Specify the language directly after the language:keyword. Example, [ language:en ] this will allow you to see only web pages in English.

Prefer: This adds emphasis on either a word or another operator. Example, [ baseball prefer: club ]

Feed: By typing feed: in your search query, this will pull RSS or Atom feeds on a website. Example, [ site:www.latimes.com feed:www.latimes.com ]

Hasfeed: By placing hasfeed: within your MSN search query, this will pull up documents that contain links to an RSS or Atom feed on a site.

See Google and Yahoo! Advanced Search Operators above for additional search commands that are the same in MSN.

After all that has been placed here, remember that Search Operators are there to make our specific searches a little bit easier. Whatever information you need to find, it is only an Advanced Search away!

Download the Ultimate Guide to Advanced Searching here.


Google Hacks for Dorks and SEO prowlers

Google Hacks… or more aptly Google Dorks are a handy tool for anyone who enjoys not only SEO, but searching in general. The term was coined by the hacker/cracker community, and with these searches you can get lots of interesting information. They call ‘em dorks because if you’re leaving information open to the search engines that shouldn’t be… then yer a dork!

And I figure if Google is allowed to read certain files and feels like serving related data back to me…then great! There should be no reason for me not to play with them and entertain you, titillate and hopefully even educate. Oh, and yea… we’ll get to some SEO stuff too…but later.
Google Advanced Operators

As any good search geek knows, the advanced operators are a great way to mine for a variety of data. You know the ones, site: link: intitle: and the rest of the family. But let’s look at how they got that name of ‘dorks’ just to get the idea… of some dorky info floating around at Google.

Let’s have some fun with a few, shall we? First let’s go looking for some sensitive data via Robots.txt. Now I am not going to show you any dirty laundry, you cheeky monkey, but if one spent enough time (and there are those that do) one would find that sensitive info is often thought by webmasters to be made invisible with this little command, the Disallow:

"robots.txt" "Disallow:" filetype:txt

or even…

"robots.txt" "Disallow:" "private" filetype:txt

Which can always be fun for an evening’s reading…. Obviously you can play with keywords and get inventive. But that’s not really a dork since robots files are publicly available… ok so let’s move along…
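
Before anyone else reads your robots file for you, it can be worth listing what it gives away. Here is a rough Python sketch that pulls the Disallow paths out of robots.txt text; the sample content and the function name `disallowed_paths` are invented for the illustration, and it ignores which User-agent group each rule belongs to.

```python
def disallowed_paths(robots_txt: str) -> list:
    """Collect every non-empty Disallow path from robots.txt text."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()        # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

# Made-up sample file for the demonstration
SAMPLE = """\
User-agent: *
Disallow: /private/   # staff only
Disallow: /tmp/
Disallow:
Allow: /public/
"""

print(disallowed_paths(SAMPLE))
```

Every path in that output is a hint to a curious visitor, so anything truly sensitive needs real access control rather than a Disallow line.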


Getting Sensitive

"not for public release"

This is an oldie but a goodie, and one can certainly play with it by looking closer at some .edu, .gov or .mil TLDs as well. For example;

"not for public release" inurl:edu

Or how about;

"not for distribution" confidential filetype:pdf

This will tighten it up to only showing PDFs, which I find to be ever so much more helpful. And let’s say for fun we’re in the travel sector looking for some good tidbits for link bait or other general business intelligence, so we add our KW to the mix;

"not for distribution" confidential, travel, filetype:pdf

You get the idea here…. all I can say is that one can start to apply the concepts behind these hacks to find all kinds of interesting reading material. And if you’re a reporter…well, I am sure your nose is tingling at the moment.
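The queries above all follow one pattern: quoted phrases, a few loose keywords, then operator:value pairs. Here is a rough Python sketch of that pattern, assuming nothing beyond string formatting. The helper name `build_query` is my own invention, and it only composes the text you would paste into the search box; it does not talk to Google.

```python
def build_query(phrases=(), keywords=(), **operators):
    """Compose a search string from quoted phrases, bare keywords
    and operator:value pairs such as filetype="pdf"."""
    parts = [f'"{phrase}"' for phrase in phrases]      # exact-match phrases
    parts.extend(keywords)                             # loose keywords
    parts.extend(f"{name}:{value}" for name, value in operators.items())
    return " ".join(parts)


if __name__ == "__main__":
    # The travel-sector example from above, rebuilt piece by piece.
    print(build_query(["not for distribution"],
                      ["confidential", "travel"],
                      filetype="pdf"))
```

Swap the keyword list or the operator and you have the next variation without retyping the whole string.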



Robots.txt (aka business intelligence)

Let’s say we’re working the ‘florida’ market and wanted to see what other sites in the space are up to; we could use

"robots.txt" "Disallow:" filetype:txt

…or even better;

(inurl:"robot.txt" | inurl:"robots.txt" ) intext:disallow filetype:txt

and

"robots.txt" "Disallow:" filetype:txt inurl:florida

What any of that information is.. or how it can be used, I leave to your imagination – I’m just sayin’…



AW Stats (aka Keyword Research)

Sticking with our Florida theme, let’s now go looking for some stats from sites that are ‘florida’ related…
florida intitle:"statistics of" "advanced web statistics"

Maybe we’re only interested in some .edu domains?
florida intitle:"statistics of" "advanced web statistics" inurl:.edu

Or maybe we want to see what keyphrases are being used to find .edu sites;
keyphrases intitle:"statistics of" "advanced web statistics" inurl:.edu

Webalizer: of course, we can do the same with Webalizer (or another popular stats program);
intitle:"Usage Statistics for" "Generated by Webalizer"

and the ‘florida’ niche with these
intitle:"Usage Statistics for" "Generated by Webalizer" inurl:florida

or….

florida intitle:"Usage Statistics for" "Generated by Webalizer"

You could even search images - inurl:/webalizer/ intitle:usage statistics + hosting

You get the idea… play with it to find more goodies. If these dorks want to leave me research data to mine for KWs and so forth…what am I to do? I merely asked Google questions and went for a random walk.


And what can you use it for?

I say there is no end to the information both educational and entertaining out there thanks to the dorks and Google. Some of the more interesting uses I have found are;

* KW research
* Link Building
* Content creation
* Competitive intelligence
* Nefarious things (for you A types)

And I am being tame with the examples… so one wonders, are we dorks?

During the research and many hours of playing around I have found the deeper, darker side, and what I have posted here merely scratches the surface as far as nefarious ways to use them go. Before ranting about Google (and other search engines) enabling this, pause and consider that the rant misses the fact that we are the dorks. Through laziness or lack of foresight we often leave things in public, much like leaving an open laptop unattended in the park in summer. Don’t be a dork.

I like to use them to find things like lists of directories and other reports to see what others are up to;

directory filetype:xls inurl:SEO OR report filetype:xls inurl:SEO

…that time looking for XLS files.

Link Builder’s dream…

Maybe you’re a happy little link builder that is looking for some nice spots to drop your legitimate/spammy links. Let’s try this;

add-links, last-updated 2000 inurl:.edu

Using advanced search operators such as we did with the Yahoo Site Explorer is another great way to track down opportunities for the fastidious link builder. First off let’s use the ‘linkdomain’ operator

linkdomain:huomah.com site:.com "SEO Blog"

1. linkdomain: - searches for links to Huomah.com
2. site: - tells it to look for results from ‘.com’ domains.
3. “SEO Blog” - searches for the KWs on the page (or hopefully in the anchor text)



That’s the basics to give you the idea… now we’ll step it up some.

We’re looking for target pages where there is a link to the site (my blog again) and that have the target term we’re after. This is by no means foolproof and does require some legwork, but it will make the targeting of relevant themes in your linking somewhat easier.

We can also do the same for .edu or .gov websites, which are perceived to be more valuable because search engines treat them as trusted sources. We’d do so like this;

1. linkdomain:example.com site:.edu "keyword"
2. linkdomain:example.com site:.gov "keyword"

…. Play with them… always some goodies to be found. We’re getting warmer….

Now let’s look at another route, which is to look at the linking sites and associated page titles. Considering the theme of the page is important to the value of the link, pages with related keywords in the page title are of interest to us. So for the keyword SEO (researching my blog as a competitor) I could do something like this;

linkdomain:huomah.com -huomah.com intitle:SEO

And when we have multiple terms, such as ‘search engine optimization’, we would use quotation marks;

linkdomain:huomah.com -huomah.com intitle:"search engine optimization"

Once more, we can also use inurl:, which looks for the keyword(s) in the URL of linking pages; another reasonably strong ranking signal.

linkdomain:huomah.com -huomah.com inurl:"search engine optimization"
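If you have a list of keywords to work through, the intitle: and inurl: variations can be generated rather than typed. The following is a small Python sketch under my own assumptions: the function names are hypothetical, the quoting rule is simply "wrap anything with a space in quotation marks", and it only builds the query text for an engine that supports linkdomain: as used above.

```python
def quote_if_needed(term: str) -> str:
    """Wrap multi-word terms in quotation marks; leave single words bare."""
    return f'"{term}"' if " " in term else term


def link_prospect_queries(domain: str, keyword: str) -> list[str]:
    """Build the intitle: and inurl: variants for one domain and keyword,
    excluding pages on the domain itself."""
    term = quote_if_needed(keyword)
    base = f"linkdomain:{domain} -{domain}"
    return [f"{base} intitle:{term}", f"{base} inurl:{term}"]


if __name__ == "__main__":
    for keyword in ["SEO", "search engine optimization"]:
        for query in link_prospect_queries("huomah.com", keyword):
            print(query)
```

Point it at a competitor’s domain and your own keyword list and you get a batch of queries to work through by hand.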

I advise playing around to find other angles from which these can be used. This is a great method (allintitle: and allinurl: for Google, whose link data sucks).
Don’t be a Dork

There are as many ways to utilize them as the imagination will allow. Advanced search operators are one of the greatest tools for SEO practitioners and hackers alike. Understanding not only how to use them, but how to protect against them (from a hacker’s viewpoint) is huge. If you want to learn more there is some further reading below;

Google HACKS – more reading

the Google Hacking DataBase - I Hack Stuff
Google Hacking Not Fun For You - WebPro News
Advanced operators reference guide – Google Guide
Advanced search operators – Van SEO Design
The ultimate guide to advanced operators - Hybrid SEM

How to Find SEO Competitor Keywords, Social Media & Backlinks

* How to evaluate your keyword competition;
* Tools for spying on competitors search marketing tactics;
* How to select your competitors; etc

Today I am going to focus on what you can learn from your competitors if you are smart enough.

First of all, a few things to take note of:

1. Your established competitors, who have been in play long enough, have probably come across numerous pitfalls and learned how to cope with them;
2. The fact that your competitor has been exploring the niche much longer than you doesn’t mean he is now doing everything right;
3. Promoting a site without proper competition research means promoting it blind;
4. By merely copying your competitor, you will never be able to surpass him;
5. If you focus on finding what your competitor is doing profoundly wrong, you have a good chance of getting ahead of him.

Keeping all that in mind, let’s see what you actually can learn from your competitors.
The Keywords

Keyword research is both difficult and tricky. The only way to effectively refine your list is to test it in practice (a PPC campaign may help you with that).

Your competitor may have already tested the keywords and chosen the best ones that both generated good traffic and converted. I do not suggest relying on his keywords completely. But if you take time analyzing and comparing several of your competitors’ on-page and off-page keyword targeting, you can make your list much better.

How can you do that?

* here are a few (both free and paid) tools to help you identify what your competitor is already ranking for;
* don’t forget to check your competitor’s PPC keywords;
* use keyword prominence tools (to analyze which terms the site is optimized for);
* analyze your competitors’ website visitors’ behavior (“Keyword Engagement” and “Keyword Effectiveness”).

Quick tip: don’t forget to compare which terms your competitor tried to rank for (i.e. which terms he is using throughout the site) and which terms he ended up ranking for. The gap between the two shows you where you can do better than him.
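That comparison is just a pair of set operations. Here is a minimal Python sketch; the keyword lists are invented for illustration, and the function name `compare_terms` is my own. In practice the two lists would come from whichever on-page and ranking tools you use.

```python
def compare_terms(targeted, ranked):
    """Split keywords into terms that were targeted and rank ("hit"),
    targeted but do not rank ("missed"), and rank without being
    targeted ("unplanned")."""
    targeted, ranked = set(targeted), set(ranked)
    return {
        "hit": sorted(targeted & ranked),
        "missed": sorted(targeted - ranked),
        "unplanned": sorted(ranked - targeted),
    }


# Hypothetical competitor data, made up for this example.
TARGETED = ["florida hotels", "miami hotels", "cheap flights"]
RANKED = ["miami hotels", "orlando resorts"]

if __name__ == "__main__":
    for bucket, terms in compare_terms(TARGETED, RANKED).items():
        print(f"{bucket}: {', '.join(terms)}")
```

The "missed" bucket is where your competitor tried and failed, and the "unplanned" bucket is traffic he may not even know he has; both are worth a closer look.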
The Backlinks

Those who linked to your competitor will most probably want to link to you. Again, the key here is not to copy step by step but to do better. So:

* find your youngest, already successful competitors to spot the newest tactics (those that are still valid and work the best);
* here are the best ways to explore your competitors in Yahoo (and beyond);
* also check some backlink checkers that help to organize and categorize your competitors’ promoters.

The Social Media

Check where your competitor has found his topical community. Make sure to analyze how the social media users react to your competitor, their feedback and comments, what they like and dislike.

* search for his [brand name] throughout forums and social communities;
* analyze what your competitor is associated with: which terms his site is tagged with.

The SEO Career Kickstart Guide : How to get a Job in SEO

You might not have dreamt about a job in SEO when you were a kid, but you can make it pretty cool if you really want to. Everyone I know in SEO loves SEO and their job, and if they’re not all that happy with who they’re working for, they won’t have any problems moving to another, better paid role whenever they want.

Recruiting an experienced, reliable SEO analyst, manager or consultant is quite a challenge for an employer. Truth is, there’s still a huge gap in the jobs market for candidates with the right CVs. If you’re a graduate looking to move into online marketing, you could do very well for yourself in SEO. Here’s my guide to getting a job in SEO and making a rockstar career for yourself.

Stage 1 - Pull together some basic skills

If you want to get an edge over the other applicants for your first role there are some skills that will put you leagues ahead of the others. Being able to look at a website and show an understanding of the basic SEO principles that underpin its success (or failure) is a great first step.

As an employer, I’ve found myself drawn to graduate CVs with words like blog, HTML, CSS, SEO, Wordpress, Analytics and so on. So, time to brush up on those skills! It’s all very well having the words on your CV but can you demonstrate how you have used them?

So, if you want to be more or less guaranteed to get the interview and sail through it, here’s what you should do:

Start a blog in a platform like Wordpress or create your own, basic website. As Danny at SEOmoz puts it:

“Before diving into SEO techniques it is important to know the basics of web development.”

I couldn’t agree more. For me, I would always go with a blog platform like Wordpress, because you get the chance to tweak the site for SEO and write about something you care about at the same time. That said, it really doesn’t hurt to understand the basics of creating a page in HTML, using a CSS stylesheet and FTPing your work to a host site. Ultimately, providing a blog or basic site url on your CV will look really good.

One of the other good reasons to use Wordpress is that it’s really easy to tweak using plugins. Here’s a guide to get you started. One of the first things you should do is set up a Google Analytics account and go get Joost De Valk’s Analytics for Wordpress plugin. Suddenly you’ve got a powerful, free analytics tool that is usable enough to easily teach you some of the basic metrics of website performance. This is important: if you can talk confidently about search engine traffic, bounce rates and keywords, and define all of those metrics you see in the Google Analytics dashboard, then you’ll be fine.
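Being able to define a metric is easier once you have calculated it by hand. As a rough illustration, here is a Python sketch of bounce rate using the common simplified definition: the share of visits that viewed only one page. The visit data is invented, the function name is my own, and your analytics tool may define the metric slightly differently.

```python
def bounce_rate(pageviews_per_visit):
    """Share of visits that viewed exactly one page (0.0 when there are no visits)."""
    visits = list(pageviews_per_visit)
    if not visits:
        return 0.0
    bounces = sum(1 for views in visits if views == 1)
    return bounces / len(visits)


# Hypothetical page-view counts for eight visits, invented for this example.
SAMPLE_VISITS = [1, 3, 1, 5, 2, 1, 4, 1]

if __name__ == "__main__":
    print(f"Bounce rate: {bounce_rate(SAMPLE_VISITS):.0%}")
```

Four of the eight sample visits left after a single page, so the sketch reports a bounce rate of 50%.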

Stage 2 - Read up on basic SEO and start to apply it

You’ve got a lot of reading to do, but don’t let that put you off! There are a few really good websites that can give you a solid kick start into the industry. One of the places I really learnt about SEO was SEOmoz. The Beginner’s Guide To SEO is still one of the most definitive and complete guides to the fundamental principles of SEO. It’s not a static document either - it has been recently updated as techniques have developed. It’s best to read a few pages a day and try to implement each idea on your new website. For example, after you’ve read the URLs, Titles and Meta Data section in the guide you might want to refer to Yoast.com’s Wordpress SEO guide and read up on how to apply optimised meta titles to your site. That Wordpress guide is extremely useful stuff, and working through both will give you a practical understanding of SEO and show you how to apply it. You might also want to try out Aaron Wall’s SEObook, which has a suite of free tools and lots of blog history to catch up on.

If you’re a quick study and you like to read, it might be time to start visiting a few of the better recognised SEO industry websites. I recommend to all beginners that they should start keeping an eye out for the best bloggers and most authoritative sources of SEO news quite early on. It really helps in an interview if you can talk about SEO sites that you visit regularly and explain why you like them. Mentioning one good site is great but knowing a few is really good. If you’re using an email client that has an RSS reader or if you use Google Reader you should definitely consider adding RSS feeds from these sites below:

SearchEngineLand - Search Engine Land: Must Read News About Search Marketing & Search Engines.

SEOmoz - SEOmoz: Read SEOmoz, Rank Better. Great guides and a wealth of thousands of thought provoking blog posts.

SEObook - SEO Book.com is a leading SEO blog by Aaron Wall covering the search space. It offers marketing tips, search analysis, and whatever random rants come to mind.

Search Engine Roundtable - The pulse of the search marketing community.

Matt Cutts - Gadgets, Google, and SEO. Matt has been the head of Google’s webspam team for as long as I can remember. Excellent background reading spanning back a long time.

There are, of course, many more recommended sites than this. Check out this post for a great list of bookmarks. If you’re feeling really brave then you could download the OPML file from Toprankblog’s Search Marketing Biglist - be prepared to delete a few though as they’re not all 100% relevant to pure SEO! If you want to download a thinned down version, you can download my OPML file here.

So you’ve read up on SEO and you have experience in applying it to your own website. What next? Links.

Stage 3 - Understand the fundamentals of linkbuilding

Some SEOs feel that linkbuilding is the hardest part of their job. The best SEOs I know founded their career in linkbuilding! Being able to discuss how to get links on the internet will really tick some boxes with your interviewer. Try reading this beginner’s guide to linkbuilding first and then check out the ideas below. You should examine each closely, and try to give a real life example of how you’d apply the technique in your interview:

- Genuinely original, link worthy content - add value for users, answer questions, demonstrate value and original thinking and you’ll attract links.

- Link bait - hilarious quotes in images of cats looking inquisitive or just plain stupid? Sounds like link bait. For a proper run down of link baiting techniques, read this and this and this and this.

- Article websites - Articlesbase.com for example. Sadly, they’ve started to nofollow links. Touche!

- Directories - debatable value lately especially as Google have removed “directories” from their webmaster guidelines but still, here’s a useful article on the top directories you should submit your blog to.

There are lots of other ways to attract links and there’s lots of really good content on the subject; try downloading “Link building notes of an SEO Kindergartner” from this article.

Stage 4 - Tools and resources for the job

I’ve just started to read through this incredibly detailed list of useful tools for SEO - The Internet Marketing Handbook. It’s an amazingly complete list and I get the feeling I’ll be referring back to it on a regular basis. It’s already added to my favourites!

Stage 5 - Start applying for jobs

Your next move is absolutely vital. Search for a recruitment agency that understands SEO and, in this case, graduate recruitment. Try agencies like The Graduate Recruitment Company in London, or take a look at jobs boards like jobsinsearch.com. Obviously not everyone reading this article will be based in London, but there are lots of recruiters searching for junior candidates with the right skills and attitude.

Speak to every agency you find, and ask questions related to the training and support you’ll receive from each of the potential employers advertising for SEO roles. You should look out for progressive agencies who will look after your training and development and give you all the support you need to succeed. Ask questions about the conferences and training they will allow you to attend and the tools at your disposal to do your job. Listen out for SMX and SES as a yardstick measure of whether you’ll be sent to good events. Obviously there are many more conferences across the world - so take a look at this conference calendar to get an idea of what’s going on and where.

Stage 6 - the Interview

There are lots of websites with example SEO interview questions. Try not to worry, you’re applying for junior roles, remember! As long as you’ve soaked up the ideas above, you’ll nail an interview. That said, here are a few example questions from me.

- Tell me about your blog. What features are optimised for search engines?

- Explain what factors might influence how a page ranks on a Google search page.

- Tell me how you would get more links to your website (I might ask for a specific example, say a car enthusiast or a recruitment company)

- What metrics might be important to a search engine marketer?

See! Pretty easy if you’ve read all of this. I hope my article has been useful and if you’re considering joining our community then I wish you all the best of luck. It’s a great industry and it can be a great deal of fun. Happy SEOing! ;-)