Well after having decided I don't want to run gallery2 any more I have another question related to it.
I'm getting hundreds of hits (well, attempted hits, it's no longer there) on the old gallery2 site, logged by logwatch as follows:-
/mandy/gallery2/main.php?g2_view=shutterfl ... en=26321822fb67: 1 Time(s)
I seem to get thirty or forty every day and today's logwatch has logged well over a hundred.
It's not an issue really, as it's not eating much bandwidth or loading my web site significantly, but I was just wondering if anyone knows whether it's a well-known security weakness in gallery2 or what? Or is someone just very desperate to look at the pictures on Mandy's blog? :-)
Maybe I should put a note at /mandy/gallery2 saying that the pictures have moved. Though, come to think about it, the link from the blog now points at where the pictures really are, so whatever is hitting that gallery2 URL must be pretty mindless.
On 09/01/12 11:59, Chris Green wrote:
Well after having decided I don't want to run gallery2 any more I have another question related to it.
I'm getting hundreds of hits (well, attempted hits, it's no longer there) on the old gallery2 site, logged by logwatch as follows:-
It's not an issue really, as it's not eating much bandwidth or loading my web site significantly, but I was just wondering if anyone knows whether it's a well-known security weakness in gallery2 or what? Or is someone just very desperate to look at the pictures on Mandy's blog? :-)
I find that the Google web crawler, and a few others, keep hitting a URL once they find it, even though it may be long gone and now gives a 404. The bots seem unable to take the hint.
On Mon, Jan 09, 2012 at 02:35:29PM +0000, nev young wrote:
On 09/01/12 11:59, Chris Green wrote:
Well after having decided I don't want to run gallery2 any more I have another question related to it.
I'm getting hundreds of hits (well, attempted hits, it's no longer there) on the old gallery2 site, logged by logwatch as follows:-
It's not an issue really, as it's not eating much bandwidth or loading my web site significantly, but I was just wondering if anyone knows whether it's a well-known security weakness in gallery2 or what? Or is someone just very desperate to look at the pictures on Mandy's blog? :-)
I find that the Google web crawler, and a few others, keep hitting a URL once they find it, even though it may be long gone and now gives a 404. The bots seem unable to take the hint.
You're right! All the recent accesses are from Google. I'm sure when I looked a few days ago they weren't, but they certainly are now. So I'm even less worried than I was (which wasn't very).
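For anyone wanting to check which clients are hitting a dead path, here is a rough sketch that tallies user-agent strings from web server log lines. It assumes the Apache "combined" log format, and the sample lines, IP addresses and the `agents_for_prefix` helper are all made up for illustration; only the /mandy/gallery2/ path comes from this thread.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def agents_for_prefix(lines, prefix):
    """Count user-agent strings for requests whose path starts with prefix."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(prefix):
            counts[match.group("agent")] += 1
    return counts


# Made-up sample log lines, for illustration only (not real traffic).
sample = [
    '192.0.2.1 - - [09/Jan/2012:11:00:00 +0000] "GET /mandy/gallery2/main.php HTTP/1.1" 404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.2 - - [09/Jan/2012:11:05:00 +0000] "GET /mandy/gallery2/main.php HTTP/1.1" 404 209 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [09/Jan/2012:11:10:00 +0000] "GET /mandy/gallery2/main.php HTTP/1.1" 404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.3 - - [09/Jan/2012:11:15:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

counts = agents_for_prefix(sample, "/mandy/gallery2/")
for agent, hits in counts.most_common():
    print(hits, agent)
```

On a real server you would feed it the lines of the access log instead of `sample`. Bear in mind that a user-agent string can be faked, so this only shows what the clients claim to be.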
On 09/01/12 14:46, Chris Green wrote:
On Mon, Jan 09, 2012 at 02:35:29PM +0000, nev young wrote:
I find that the Google web crawler, and a few others, keep hitting a URL once they find it, even though it may be long gone and now gives a 404. The bots seem unable to take the hint.
You're right! All the recent accesses are from Google. I'm sure when I looked a few days ago they weren't, but they certainly are now. So I'm even less worried than I was (which wasn't very).
Could it be that Google have noticed that there's nothing there and are scanning it frequently to find when it comes back?
Anyway, you may be able to stop it with a robots.txt file in the root of your website. Personally, I'd think that a robots.txt file is a good idea on any website that has bits you don't want search engines to hit, even if some web-crawlers don't honour it.
http://en.wikipedia.org/wiki/Robots_exclusion_standard
HTH Steve
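As a rough illustration of the robots.txt suggestion, the sketch below checks a candidate rule with Python's standard `urllib.robotparser` before putting it on the server. The two-line robots.txt content and the `allowed` helper are assumptions for the example; the only thing taken from the thread is the /mandy/gallery2/ path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content asking all crawlers to skip the old gallery.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mandy/gallery2/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if the parsed rules let this user agent fetch the path."""
    return parser.can_fetch(agent, path)


print(allowed("Googlebot", "/mandy/gallery2/main.php"))  # blocked by the rule
print(allowed("Googlebot", "/mandy/"))                   # not covered by the rule
```

This only shows how a crawler that honours the file would read the rule; it does nothing about crawlers that ignore robots.txt.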
On 09/01/12 22:17, steve-ALUG@hst.me.uk wrote:
On 09/01/12 14:46, Chris Green wrote:
On Mon, Jan 09, 2012 at 02:35:29PM +0000, nev young wrote:
I find that the Google web crawler, and a few others, keep hitting a URL once they find it, even though it may be long gone and now gives a 404. The bots seem unable to take the hint.
You're right! All the recent accesses are from Google. I'm sure when I looked a few days ago they weren't, but they certainly are now. So I'm even less worried than I was (which wasn't very).
Could it be that Google have noticed that there's nothing there and are scanning it frequently to find when it comes back?
Could be, but after a few months you'd think they'd give up. Google can be told to stop crawling dead pages via their Webmaster Tools pages, although I've given up doing that.
Anyway, you may be able to stop it with a robots.txt file in the root of your website. Personally, I'd think that a robots.txt file is a good idea on any website that has bits you don't want search engines to hit, even if some web-crawlers don't honour it.
robots.txt is a two-edged sword.
Good bots keep out when told. Bad bots ignore it and enter anyway. Blackhats are alerted that you have a page you don't want them to see.
(hmmm, maybe that's a three-edged sword).
On Mon, Jan 09, 2012 at 10:17:32PM +0000, steve-ALUG@hst.me.uk wrote:
On 09/01/12 14:46, Chris Green wrote:
On Mon, Jan 09, 2012 at 02:35:29PM +0000, nev young wrote:
I find that the Google web crawler, and a few others, keep hitting a URL once they find it, even though it may be long gone and now gives a 404. The bots seem unable to take the hint.
You're right! All the recent accesses are from Google. I'm sure when I looked a few days ago they weren't, but they certainly are now. So I'm even less worried than I was (which wasn't very).
Could it be that Google have noticed that there's nothing there and are scanning it frequently to find when it comes back?
Anyway, you may be able to stop it with a robots.txt file in the root of your website. Personally, I'd think that a robots.txt file is a good idea on any website that has bits you don't want search engines to hit, even if some web-crawlers don't honour it.
Ah, yes, I used to have a robots.txt file; maybe I should reinstate it.