Google hacking is the use of a search engine, such as Google, to locate a security vulnerability on the Internet. There are generally two types of vulnerabilities to be found on the Web: software vulnerabilities and misconfigurations.
Although there are some sophisticated intruders who target a specific system and try to discover vulnerabilities that will allow them access, the vast majority of intruders start out with a specific software vulnerability or common user misconfiguration that they already know how to exploit, and simply try to find or scan for systems that have this vulnerability. Google is of limited use to the first attacker, but invaluable to the second.
How can you prevent Google hacking?
Make sure you are comfortable with sharing everything in your public Web folder with the whole world, because Google will share it, whether you like it or not. Also, in order to prevent attackers from easily figuring out what server software you are running, change the default error messages and other identifiers. Often, when a "404 Not Found" error is detected, servers will return a page like that says something like:
Not FoundThe requested URL /cgi-bin/xxxxxx was not found on this server.Apache/1.3.27 Server at your web site Port 80
The only information that the legimitate user really needs is a message that says "Page Not found." Restricting the other information will prevent your page from turning up in an attacker's search for a specific flavor of server.
Google periodically purges it's cache, but until then your sensitive files are still being offered to the public. If you realize that the search engine has cached files that you want to be unavailable to be viewed you can go to ( http://www.google.com/remove.html ) and follow the instructions on how to remove your page, or parts of your page, from their database.