Giovanni Porta
2018-09-24 05:38:35 UTC
Hello all,
For the past week or so, I've been trying to mirror a website with Wget. However, after a couple of days of downloading (approximately 38 GB so far), Wget exhausts all system memory and swap, and the process gets killed. The server I'm using has 2 GB of RAM and 2 GB of swap.
I'm using Ubuntu 16.04, initially with Wget 1.17.1. I have also compiled and tried the newest version, 1.19.5, after reading this bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642563
My luck has not changed since updating, and I'm at a loss as to what else to try. Surely it isn't normal for memory usage to climb like this?
These are the parameters I'm using to download:
wget --load-cookies cookies.txt --warc-file="site" --mirror --convert-links --adjust-extension --page-requisites --random-wait --accept-regex ".*(ubb=cfrm)|(ubb=postlist)|(ubb=showflat)|(images)|(styles)|(ubb_js).*" --restrict-file-names=windows "https://example.com"
Here is the log from dmesg leading up to the process being killed: https://pastebin.com/gR4cGQdA
Any ideas?
Thanks.
Gio