4. Guard against crawlers that impersonate Baiduspider. If your site's bandwidth is congested, the cause may be someone posing as Baiduspider and scraping your site maliciously. If >
A search engine keeps an internal URL index database. Its spiders set out from the search engine's servers, follow the URLs already in that database to fetch pages, and bring the page content back. Once a page has been collected, the search engine analyzes it and separates the content from the links; we will leave the content aside for now. After analyzing the links, the search engine does not send a spider to fetch them right away. Instead, it records the links and their anchor text in the URL index database, where they are analyzed, compared, and scored before finally entering the index. Only after a URL has entered the index will a spider crawl it.
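The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real search engine is implemented: the names `LinkExtractor`, `UrlIndex`, `process_page`, and the `accept` scoring callback are all hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from a fetched page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


@dataclass
class UrlIndex:
    """Hypothetical URL index database: recorded links wait before being crawled."""

    pending: dict = field(default_factory=dict)  # url -> list of anchor texts
    indexed: set = field(default_factory=set)    # urls a spider may crawl

    def record(self, url, anchor):
        # Newly found links are only recorded, never crawled immediately.
        if url not in self.indexed:
            self.pending.setdefault(url, []).append(anchor)

    def admit(self, accept):
        # Analysis step: only links that pass the check enter the index.
        for url in [u for u, a in self.pending.items() if accept(u, a)]:
            del self.pending[url]
            self.indexed.add(url)


def process_page(page_url, html, index):
    """Split a page into links and record each one with its anchor text."""
    parser = LinkExtractor()
    parser.feed(html)
    for href, anchor in parser.links:
        index.record(urljoin(page_url, href), anchor)
```

A spider in this sketch would only ever pick URLs from `index.indexed`, which is why a freshly discovered link is not crawled the moment it is found.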
1. In general, Baiduspider does not put much load on a web server. It automatically adjusts its access density to the server's load capacity, and after a period of continuous access it pauses for a while so that it does not add to the pressure on the server.
Understanding Baiduspider
3. If you want Baidu to index your page content but not keep a snapshot of it, you can use a meta setting so that Baidu only builds an index entry for the page and does not show a snapshot (cached copy) of it in the search results.
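The text does not spell out the meta setting. A commonly cited form is `<meta name="Baiduspider" content="noarchive">`; treat that exact tag as an assumption and confirm it against Baidu's own webmaster documentation. Under that assumption, a small Python check for whether a page opts out of snapshots might look like this (`NoArchiveFinder` and `snapshot_disabled` are hypothetical names):

```python
from html.parser import HTMLParser


class NoArchiveFinder(HTMLParser):
    """Record which robots a page asks not to store a snapshot."""

    def __init__(self):
        super().__init__()
        self.noarchive_for = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        directives = {d.strip().lower() for d in attr.get("content", "").split(",")}
        if "noarchive" in directives:
            self.noarchive_for.add(attr.get("name", "").lower())


def snapshot_disabled(html, robot="Baiduspider"):
    """True if the page carries a noarchive meta tag for `robot` or for all robots."""
    finder = NoArchiveFinder()
    finder.feed(html)
    return robot.lower() in finder.noarchive_for or "robots" in finder.noarchive_for
```

Running `snapshot_disabled(page_html)` on your own templates is a quick way to confirm the tag actually made it into the rendered page.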
Baidu's spider, whose English name is "Baiduspider", is an automated program run by the Baidu search engine. Its job is to visit HTML pages on the Internet and build an index database, so that users can find your site's pages through Baidu search.
2. If you do not want Baiduspider to access your website, you can use a robots.txt file to ban Baiduspider from your site entirely, or to block it from specific files on the site. Note: banning Baiduspider from your site means your pages can no longer be found through Baidu, or through any search engine whose search service is provided by Baidu.
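As a rough illustration of both options, the Python sketch below feeds two example robots.txt files to the standard library's `urllib.robotparser` and asks what Baiduspider would be allowed to fetch. The rule sets, the `/private/` path, and the `example.com` URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Option 1: ban Baiduspider from the whole site.
BLOCK_ALL = """\
User-agent: Baiduspider
Disallow: /
"""

# Option 2: keep Baiduspider out of one directory only (hypothetical path).
BLOCK_SOME = """\
User-agent: Baiduspider
Disallow: /private/
"""

full_ban = build_parser(BLOCK_ALL)
partial_ban = build_parser(BLOCK_SOME)

print(full_ban.can_fetch("Baiduspider", "https://example.com/index.html"))
print(partial_ban.can_fetch("Baiduspider", "https://example.com/private/a.html"))
print(partial_ban.can_fetch("Baiduspider", "https://example.com/public/a.html"))
```

Because the rules name only Baiduspider, other crawlers are unaffected by either file.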
In other words, when a page gains an external link, a spider is not sent to fetch the page right away; there is first a process of analysis and calculation. Even if the external link is deleted after the spider has crawled it, the search engine may already have recorded the link and can still crawl its target. If the spider later revisits the page that carried the link and finds that the link is gone, or that the page now returns a 404, it will most likely just lower the weight of that link rather than delete the URL from the index database.
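The "lower the weight, keep the URL" behavior described here can be pictured with a small Python sketch. Everything in it is hypothetical: the `LinkTable` structure, the `decay` factor of 0.5, and the idea of a single numeric weight are illustrative choices, not known details of any search engine.

```python
from dataclasses import dataclass, field


@dataclass
class LinkRecord:
    weight: float = 1.0  # hypothetical score for this recorded link
    misses: int = 0      # how many recrawls found the link gone or the page 404


@dataclass
class LinkTable:
    records: dict = field(default_factory=dict)  # target URL -> LinkRecord
    decay: float = 0.5                           # assumed penalty per failed recrawl

    def observe(self, target, link_present, status=200):
        """Update a link after recrawling the page that carried it."""
        rec = self.records.setdefault(target, LinkRecord())
        if not link_present or status == 404:
            # Penalize the link, but keep its URL in the table.
            rec.weight *= self.decay
            rec.misses += 1
        return rec
```

Note that `observe` never removes an entry from `records`, which mirrors the point that a vanished link loses weight while its URL stays in the index database.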