Internet Archive crawler-bot just brought down my server
Who runs the archive.org_bot??? They’ve just brought down the dokk server with tens of requests per second. I understand their need to archive things, but slow down, guys! I only have a tiny server!!!
Are there that many links?
Well, yes, there are thousands of links.
Thousands… not counting the library? Just curious: do they try to archive PDFs and such things?
Thousands in the entire website (I don’t know the precise number).
The bot was downloading the whole website, requesting between 20 and 50 pages per second.
That looks like a disguised DDoS attack. Maybe it's related to the incident with lots of registrations (https://freepo.st/post/ZJNAChJ0cc). Usually a crawler sleeps 1 or 2 seconds between requests to avoid bringing down the server. Someone could have simply chosen that name for their crawler; I doubt the Internet Archive would request 20-50 pages per second. I wonder why someone would want to attack this site. 🤔
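To make the "sleep 1 or 2 seconds" point concrete, here is a rough sketch of what a polite crawl loop could look like. It is only an illustration: `crawl_politely`, `fetch` and `delay_seconds` are made-up names, and I have no idea how archive.org_bot is actually implemented.

```python
import time
from typing import Callable, Iterable, List


def crawl_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is a hypothetical callable that downloads one page; it is
    passed in so the pacing logic stays independent of any HTTP library.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first, so the server
            # sees at most one request per `delay_seconds`.
            sleep(delay_seconds)
        pages.append(fetch(url))
    return pages
```

With a 1 second delay, a site with a few thousand pages takes about an hour or more to crawl instead of a couple of minutes, but a tiny server only ever handles one request at a time.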
Good points.
For now they have stopped, and I already have so much work to do with the library and the documentation that I don’t want to spend time investigating this further. Let’s hope it won’t happen again!