I've been in contact with a website which hosts a large range of images (www.bioimages.org.uk). The images are "a large selection of pictures of Natural History objects, mostly British in origin", which currently amount to around 400MB of HTML (autogenerated from a database) on IIS, and 3-4GB of photographs hosted separately.
The site is currently offline due to attempted site rips causing load and bandwidth problems. I've offered to help look at the problem, but due to the capacity required I don't think I'm in a position to help with the hosting, although I may be able to help technically.
Several issues come to mind which I'd like advice on. I think there are Apache modules which can help to control the abuse aspect, but I'm not sure where to start - any suggestions? There are options like scripting the site (e.g. using PHP) and limiting the number of downloads per session. I could probably come up with a solution, but it's not something I have experience in managing.
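To make the "limit downloads per session" idea concrete, here is a minimal sketch of a sliding-window limiter. It is in Python rather than PHP purely for illustration, and everything in it is an assumption on my part: the class name DownloadLimiter, the limits, and the idea of keying on a session ID or IP address are all hypothetical and not something the site currently does. It also keeps its state in memory, so it would only work as-is inside a single long-running process.

```python
import time
from collections import defaultdict, deque


class DownloadLimiter:
    """Allow at most max_downloads per client within window_seconds.

    Hypothetical helper: client_id could be a session ID or an IP address.
    """

    def __init__(self, max_downloads=100, window_seconds=3600.0,
                 clock=time.monotonic):
        self.max_downloads = max_downloads
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._history = defaultdict(deque)  # client_id -> request times

    def allow(self, client_id):
        """Return True and record the request if the client is under the limit."""
        now = self.clock()
        history = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_downloads:
            return False
        history.append(now)
        return True
```

A request handler would call allow() before serving each image and return an error page (or a 429 status) when it gets False. Whether this is better done in a script or in a server module is exactly the part I'd like advice on.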
If anyone has suggestions for a generous host, that would be useful too. What he's currently looking for is hosting for the HTML but not the image library, but he cites bandwidth as the biggest problem. I need to work out why that is - I'd be surprised if the bandwidth for the HTML itself is a huge problem (although I think the way the pages are generated means that they are static pages with new timestamps after each generation, so search engine traffic alone could be massive, since every page looks modified on every crawl).
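If the timestamp theory is right, one possible fix is for the generator to leave a page alone when its content hasn't changed, so the file's modification time survives and crawlers that send If-Modified-Since can get a 304 instead of the full page. A minimal sketch in Python follows; write_if_changed is a hypothetical helper, I don't know how the real generator writes its output, and this assumes the web server derives Last-Modified from the file's modification time.

```python
from pathlib import Path


def write_if_changed(path, new_content: bytes) -> bool:
    """Write new_content to path only if it differs from what is on disk.

    Returns True if the file was written, False if it was left untouched
    (in which case its modification time is preserved).
    """
    path = Path(path)
    if path.exists() and path.read_bytes() == new_content:
        return False
    path.write_bytes(new_content)
    return True
```

This would only help if the regenerated pages are byte-for-byte identical when the underlying data is unchanged; if the generator embeds a "generated on" date in each page, that would have to go first.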
I have no direct connection with the site; it was a friend, who manages a site which makes reference to the images, who brought the problem to my attention. I am inclined to help because I think this sort of thing is what the web should be about (i.e. decent libraries of decent information).