Steve Smith, host of your TQA Weekly, explains how to search for a document or page within a web-site, despite the web-site not having its own search engine, and also explains why this may not work.
Episode #2-37 released on June 10, 2012
Ever wanted to search a web-site for a specific article you can't find anymore, but that web-site didn't have a search engine to use to find it? Today, on TQA Weekly, I'm going to explain how to search a web-site using Google and only get results from a specific domain name. As a bonus, I'm going to give you the code that gives the visitors of your own web-site access to one of the most powerful search engines on the internet, Google.
First, let's talk about search engines, so we understand how they get their search results in the first place. Search engines use bots to scan web-sites, following links from one web-site to another. These legitimate bots read the contents of the robots.txt file, the meta tags in the page header, and, if available, the contents of the site maps. These are the basics of search engine optimization, and not all of them are required for a web-site to be search-able. The issue is that, if the owner hasn't submitted their web-site to be indexed, someone needs to link to your web-site of interest for it to be found.
For any new content to show up in the search results, it may take up to a week before the web-site is re-crawled, which means a document may need to be a few days old before it is search-able. However, search engines may come back more often to web-sites that continuously pump out new content.
Now, at any point, the web developer may, for any reason they choose, want to prevent the web-site from being indexed, which makes searching for it through outside means impossible. Although this is really rare, Facebook is an example of a web-site that is not search-able beyond the front page. This was a move to protect the privacy of the users and prevent them from being easily search-able.
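To make the robots.txt part a little more concrete, here is a minimal sketch of how a well-behaved bot might check the rules before fetching a page. The robots.txt contents, the example.com addresses and the bot name "ExampleBot" are all made up for illustration, and real search engine bots are certainly more involved than this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. A real bot would download this
# file from the root of the web-site before crawling anything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a bot can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# "ExampleBot" is an assumed bot name; the "*" rules apply to it.
print(rules.can_fetch("ExampleBot", "https://example.com/articles/search-tips"))
print(rules.can_fetch("ExampleBot", "https://example.com/private/drafts"))

# Site maps listed in robots.txt (site_maps() needs Python 3.8 or newer).
print(rules.site_maps())
```

The first address is allowed and the second is blocked by the Disallow rule, so only the first would ever have a chance of showing up in search results.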
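One common way a developer can ask search engines not to index a page is a robots meta tag containing "noindex". The sketch below checks a page for that tag using only Python's standard library. The sample pages are invented for the example, and this is only one of several ways a web-site can opt out of indexing.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def blocks_indexing(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical pages, invented for illustration.
hidden_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
public_page = '<html><head><meta name="description" content="Show notes"></head></html>'

print(blocks_indexing(hidden_page))
print(blocks_indexing(public_page))
```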
To search a web-site, let's take ours for example, you may want to learn how to use Google and other search engines more precisely. In order to search a specific domain address, you need to add the following to your search query:
search query site:domain address
If the web-site has a lot of content, you may want to use an even more precise method of defining the domain address and date range, alongside the search query, like this:
search query 2012..2012 site:domain address
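If you would rather build these searches in a script, the two query patterns above can be turned into a Google search address like this. The function name site_search_url and the sample search words are my own choices for the example. The address format is simply the usual Google search page with the whole query passed in the q parameter.

```python
from urllib.parse import urlencode


def site_search_url(query, domain, year_range=None):
    """Build a Google search URL limited to one domain.

    year_range is an optional (start, end) pair, written into the
    query as "start..end" exactly like the pattern shown above.
    """
    terms = [query]
    if year_range is not None:
        start, end = year_range
        terms.append(f"{start}..{end}")
    terms.append(f"site:{domain}")
    # urlencode escapes spaces and the colon so the URL stays valid.
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


print(site_search_url("salted hash", "tqaweekly.com"))
print(site_search_url("salted hash", "tqaweekly.com", (2012, 2012)))
```

Pasting either printed address into a browser should run the same search you would get by typing the query by hand.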
Provided the web-site is search-able, and the article you're looking for made it into the last bot scan, you should be able to find what you're looking for very easily.
Now, if your own web-site doesn't have a search engine in place, you may offer your visitors the one we currently use, which is powered by Google's search engine. The code is down below.
<!-- Replace domainnamehere.com with your own domain name. -->
<form method="get" action="https://www.google.com/search">
<input type="hidden" name="sitesearch" value="domainnamehere.com">
<input type="text" name="q" size="24" maxlength="255" value="">
<input type="submit" value="Search Site">
</form>
Next week, I'll be explaining what a hash is and, more importantly, why hashes need to be salted. And, as an added bonus, I've started a new contest called All Coloured Salts, which will help you discover the world of hashing and salts for yourselves. The first person to submit the solution for the salted hash will win a $100 Amazon gift card, provided they answer the skill-testing question within the hash correctly. To participate in the contest, head over to tqaweekly.com/contests, read the rules, and get started.
Remember to like, share and subscribe to TQA Weekly. For more information like our show notes, how to join our mailing list, get your own TQA Weekly branded gear and apparel, or for our Android Application, please visit tqaweekly.com. Stay safe and online, have a great day!
Host : Steve Smith | Music : Jonny Lee Hart | Editor : Steve Smith | Producer : Zed Axis Productions