So let's start at the beginning: what is a hit? In the purest sense, a hit is any single request made to your site, so loading a page produces a hit for the page itself and one for every image it uses. The hit count for a page is therefore the number of requests a browser had to make to fetch it. At Blog-City, a hit is defined simply as a single page request. All the other 'hits' (images, files, movies, photos, etc.) are still counted, but we do not include them in your stats; that information is only used internally.
Let's put some numbers on this. Consider a page with no images on it, as opposed to a page that has 30 pictures on it (not unreasonable; just look at the home page of Amazon). Imagine a single person visiting each page once. The first page will generate 1 hit, and the second will generate 31 hits. Which is the more popular site? Well, in the early dot-com days, marketeers would have you believe that the second site was, because it had 30 more hits than the first. But as we know, that is a meaningless figure: both pages were viewed exactly once. So hits became known as 'page-requests', and that removed a lot of the confusion. In the Blog-City world, every time we say hit, we mean 'page-request'. We keep using the word 'hit' because 'page-request' never really caught on as a buzzword outside of the technical world.
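The arithmetic above can be sketched in a few lines. This is only an illustration: the request lists and the count function are made up for the example and are not how Blog-City actually stores its logs.

```python
def count(requests):
    """Return (hits, page_requests) for a list of requested resource kinds."""
    hits = len(requests)  # purest definition: every request is a hit
    page_requests = sum(1 for kind in requests if kind == "page")
    return hits, page_requests


# One visitor loads each page once (hypothetical data).
plain_page = ["page"]                  # a page with no images
busy_page = ["page"] + ["image"] * 30  # a page with 30 pictures

print(count(plain_page))  # (1, 1)
print(count(busy_page))   # (31, 1)
```

Counted as raw hits, the second page looks 31 times busier; counted as page-requests, both pages were viewed exactly once.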
Page-requests are now going through the same sort of confusion that the early hit counts did: artificially inflated numbers. This is largely down to the search engines. Have you noticed that you don't need to submit your site to any search engine these days? By some witchcraft, sites magically appear in search results. How do you think they get there if no one is adding them? The search engine has to crawl (or spider) your site. It gets to your site because someone first linked to it; it then reads that page, indexes the content, and keeps a note of all the URLs on that page as more pages it should go and visit. And so the spiral continues. Again, let's look at an example.
Google finds your URL somehow (some link on some other site) and goes and reads that page. While it's indexing your content into its search engine, it keeps a note of all the URLs you refer to on your site, including links to external sites and links to other entries on your own blog. It will then start the cycle again, visiting each new link it found. This is called spidering, or crawling, your site. Each time this happens, a 'hit' or 'page-request' is generated.
Now, the majority of bloggers do not update all their content all the time. They merely keep adding content, but they have links to all their previous blog entries. So when Google (using Google as the generic name for spiders) comes along and reads your latest blog entry, it follows the links to all your other entries and reads them again. But they haven't changed. Why do they need to be indexed again? This is one area where Blog-City has improved the technology: we don't permit spiders to re-read an entry if its content hasn't changed since the last time they read it. The spider tells us the date it last read a particular entry; we compare that with the date the entry last changed and decide there and then whether or not the spider gets to read it again. If it doesn't need to read it again, then no 'page-request' or 'hit' is registered.
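The date check described above looks a lot like what HTTP calls a conditional request, where the client sends an If-Modified-Since header and the server can answer 304 Not Modified instead of sending the page again. Here is a minimal sketch of that decision, assuming that is the mechanism in use; the function name and the idea of returning a "counts as a page-request" flag are invented for illustration and are not Blog-City's actual code.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond_to_spider(if_modified_since, entry_last_modified):
    """Return (http_status, counts_as_page_request) for one spider request.

    if_modified_since   -- raw header value sent by the spider, or None
    entry_last_modified -- timezone-aware datetime the entry last changed
    """
    if if_modified_since is not None:
        try:
            last_read = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            last_read = None  # unparseable date: treat it as a first visit
        if last_read is not None:
            if last_read.tzinfo is None:
                last_read = last_read.replace(tzinfo=timezone.utc)
            # HTTP dates only have one-second resolution.
            changed = entry_last_modified.replace(microsecond=0)
            if changed <= last_read:
                return 304, False  # nothing new: no re-read, no hit
    return 200, True  # new or changed content: serve it and count it


entry = datetime(2005, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
print(respond_to_spider("Tue, 01 Mar 2005 12:00:00 GMT", entry))  # (304, False)
print(respond_to_spider("Mon, 28 Feb 2005 12:00:00 GMT", entry))  # (200, True)
```

A spider that sends no date at all, or a date we can't parse, is simply treated as if it had never read the entry before.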
But does that mean your blog has slipped out of Google? No. In fact, do a search for your blog on Google and you'll see that it is still there.
Now, as you can appreciate, by removing these unnecessary hits Blog-City has significantly reduced the redundant load on its servers, which means less bandwidth is used.
On to the second part: rogue spiders.
A rogue spider is an automated program, not a real person, that doesn't obey the guidelines of the web that well-behaved spiders follow. These are generally pieces of software still in development, or spam-crawlers looking for email addresses. Our system dynamically monitors who is doing what, catches these guys as and when it's happening, and restricts their access. We don't completely deny their requests; we merely slow them down. So all these repeated or rogue requests don't count. They add no real value to you, since they weren't real people reading your blog, merely automated bots that in some instances are not even harvesting your content for aggregation. At the end of the day, our most important person is the person reading your blog. They are the ones that we all care about.
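To give a feel for the "slow them down, don't block them" idea, here is a rough sketch of a per-client request monitor. Everything in it is an assumption made for illustration: the SpiderThrottle class, the thresholds, and the sliding-window approach are not taken from Blog-City's real system, which is not described in detail here.

```python
from collections import defaultdict, deque


class SpiderThrottle:
    """Flag clients that make too many requests in a short window (hypothetical)."""

    def __init__(self, max_requests=10, window_seconds=60.0, delay_seconds=5.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.delay_seconds = delay_seconds
        self._recent = defaultdict(deque)  # client -> timestamps of recent requests

    def check(self, client, now):
        """Return (delay_seconds, counts_in_stats) for a request made at time `now`."""
        recent = self._recent[client]
        # Forget requests that have fallen out of the monitoring window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_requests:
            # Still served, but slowed down and left out of the stats.
            return self.delay_seconds, False
        return 0.0, True


throttle = SpiderThrottle(max_requests=3, window_seconds=10, delay_seconds=2.0)
for t in (0, 1, 2, 3):
    print(throttle.check("bot", t))
# (0.0, True) three times, then (2.0, False)
```

The request is never refused outright; the caller would just wait for the returned delay before answering, and skip the stats counter whenever the second value is False.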