Discussion:
[Bug-wget] Wget running out of memory
Giovanni Porta
2018-09-24 05:38:35 UTC
Hello all,

For the past week or so, I've been attempting to mirror a website with Wget. However, after a couple of days of downloading (approximately 38 GB so far), Wget eventually exhausts all system memory and swap, and the process gets killed. The server I'm using has 2 GB of RAM and 2 GB of swap.

I'm using Ubuntu 16.04, initially with Wget 1.17.1, but after reading this bug report I have also compiled and tried the newest version, 1.19.5: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642563

My luck has not changed since updating, and I'm at a loss as to what else to try. Surely it isn't normal for memory usage to climb like this?

These are the parameters I'm using to download:
wget --load-cookies cookies.txt --warc-file="site" --mirror --convert-links --adjust-extension --page-requisites --random-wait --accept-regex ".*(ubb=cfrm)|(ubb=postlist)|(ubb=showflat)|(images)|(styles)|(ubb_js).*" --restrict-file-names=windows "https://example.com"

Here is the log from dmesg leading up to the process being killed: https://pastebin.com/gR4cGQdA

Any ideas?

Thanks.
Gio
Tim Rühsen
2018-09-24 13:31:53 UTC
Hi Giovanni,
Post by Giovanni Porta
Hello all,
For the past week or so, I've been attempting to mirror a website with Wget. However, after a couple of days of downloading (approximately 38 GB so far), Wget eventually exhausts all system memory and swap, and the process gets killed. The server I'm using has 2 GB of RAM and 2 GB of swap.
I'm using Ubuntu 16.04, initially with Wget 1.17.1, but after reading this bug report I have also compiled and tried the newest version, 1.19.5: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642563
My luck has not changed since updating, and I'm at a loss as to what else to try. Surely it isn't normal for memory usage to climb like this?
It surely isn't "normal", but then your use case isn't exactly normal
either (*days* of downloading, 38 GB).

Recursive downloads require Wget to keep every downloaded URL in
memory, so that it does not download the same pages again and again.
Additionally, --convert-links needs extra memory to track the data it
collects while parsing each page. And if the server sends new cookies
for every visited page, those add yet more memory consumption, since
they are all kept in memory as well.

All of this adds up over time, and 2 GB simply isn't enough for your task.
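
If you want to confirm that it is Wget's own memory that keeps
growing, a rough monitoring loop along these lines should do (this is
just a sketch and assumes a single wget process on the machine; the
log file name is arbitrary):

  # Log wget's resident set size (RSS, in KiB) once a minute.
  # Adjust the -C selector if more than one wget is running.
  while sleep 60; do
      date
      ps -o pid,rss,etime,args -C wget
  done >> wget-memory.log

The RSS column should show the same steady climb you observed before
the OOM killer steps in.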
Post by Giovanni Porta
Any ideas?
- get more RAM
- split that one huge download into several smaller ones (sketch below)
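
For the second point, here is a rough, untested sketch of how the
split could look, reusing your exact options (the site-partN WARC
names are just placeholders). Each run keeps the forum-index patterns
(cfrm, postlist) so that links to the heavier content can still be
discovered, and adds one content pattern at a time; the pattern list
is simply taken from your --accept-regex:

  i=0
  for extra in 'ubb=showflat' 'images' 'styles' 'ubb_js'; do
      i=$((i+1))
      # --mirror implies timestamping, so files already fetched in an
      # earlier run are not downloaded again.
      wget --load-cookies cookies.txt --warc-file="site-part$i" \
           --mirror --convert-links --adjust-extension --page-requisites \
           --random-wait --restrict-file-names=windows \
           --accept-regex ".*(ubb=cfrm)|(ubb=postlist)|($extra).*" \
           "https://example.com"
  done

Each run then has far fewer URLs to remember, which should keep the
in-memory bookkeeping smaller. The trade-offs: --convert-links only
sees what its own run downloaded, so some links between the parts may
not be rewritten, and you end up with one WARC per part instead of a
single archive.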

That's all I can come up with :-)

Regards, Tim
