Discussion:
[Bug-wget] [bug #51181] Unexpected "Redirecting output to 'wget-log'."
Peter Wu
2017-06-04 18:45:14 UTC
URL:
<http://savannah.gnu.org/bugs/?51181>

Summary: Unexpected "Redirecting output to 'wget-log'."
Project: GNU Wget
Submitted by: lekensteyn
Submitted on: Sun 04 Jun 2017 06:45:12 PM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: 1.19.1
Operating System: GNU/Linux
Reproducibility: Every Time
Fixed Release: None
Planned Release: None
Regression: Yes
Work Required: None
Patch Included: None

_______________________________________________________

Details:

Since upgrading to wget 1.19.1 from wget 1.18 on Arch Linux, the normal
progress output is omitted and instead a "wget-log" file is created in some
circumstances.

I am not sure exactly how this is triggered; I can only reproduce it when
starting wget in an initrd.

Steps to reproduce:
1. Create an initrd with "/init" consisting of the script below.
2. Boot the initrd, for example with:
   qemu-system-x86_64 -m 1G -M pc,accel=kvm -kernel /boot/vmlinuz-linux \
       -initrd initrd.gz -nographic -serial stdio -monitor none \
       -append console=ttyS0

Expected output:
A status message such as:
--2017-06-04 20:38:57-- http://127.0.0.1/
Connecting to 127.0.0.1:80... failed: Connection refused.

Actual output:
Redirecting output to 'wget-log'.

Other information:
I triggered this problem in my bootstrap script
https://github.com/Lekensteyn/archdir. Another affected user report:
https://unix.stackexchange.com/q/363765/8250
It could be a regression from commit v1.18-84-gdd5c549f


#!/bin/sh
export PATH=/usr/bin:/usr/sbin
/bin/busybox --install -s
ip link set lo up
mkdir /new
mount -t tmpfs none /new
cp -a /lib* /*bin /usr /etc /new/
chroot /new /usr/bin/wget -O - 127.0.0.1 || :

# Power down
mkdir /proc
mount -t proc none /proc
echo > /proc/sysrq-trigger o
echo should not be reached





_______________________________________________________

Reply to this item at:

<http://savannah.gnu.org/bugs/?51181>

_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/
Noël Köthe
2017-09-17 19:39:45 UTC
Follow-up Comment #1, bug #51181 (project wget):

an additional comment from a Debian user:

--8<--
This has hit many and is already reported in several distributions, but not in
Debian yet.

Since 1.19, when run in the background, even with --quiet, wget creates a log
file wget-log in the current directory, which is normally empty.
If wget-log exists, it creates wget-log.1, and so on.

The workaround is to use -o /dev/null, but this changed behaviour breaks
existing scripts and is undocumented.
--8<--
https://bugs.debian.org/874590

anonymous
2017-10-27 17:35:17 UTC
Follow-up Comment #2, bug #51181 (project wget):

This affects me, too. Please fix.


Tim Ruehsen
2017-10-27 20:29:35 UTC
Update of bug #51181 (project wget):

Status: None => Confirmed

_______________________________________________________

Follow-up Comment #3:

From the mailing list (21./22.5.2017):

If you use an explicit logfile (-o / --output-file), that code for
creating 'wget-log' isn't triggered.

It is a documented behavior, see 'man wget':
" -b
--background
Go to background immediately after startup. If no output
file is specified via the -o, output is redirected to wget-log.
"

Maybe the implementation was buggy and didn't match the docs, and it has been
fixed since 1.19?

Anyways, we want to keep backward compatibility and if that change breaks
existing scripts, we should revert the change and change the docs (I am sure
people dislike that as well ;-)).

Peter Wu
2017-12-27 12:58:00 UTC
Additional Item Attachment, bug #51181 (project wget):

File name: 0001-Avoid-redirecting-output-to-file-when-tcgetpgrp-fail.patch
Size: 1 KB


Peter Wu
2017-12-27 13:08:00 UTC
Follow-up Comment #4, bug #51181 (project wget):

I have investigated the issue, found the cause, and attached a patch for it.

In the provided test case, file descriptors std{in,out,err}
(/proc/self/fd/{0,1,2}) seem to refer to /dev/console. This happens even
without chroot.
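
One way to check what the standard descriptors point at is to read the
/proc/self/fd symlinks from C. The following is only a minimal sketch, assuming
Linux with /proc mounted; fd_target and print_std_fds are hypothetical helper
names and are not part of wget:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store the target of /proc/self/fd/<fd> in buf.
   Returns the length of the target, or -1 on error. */
long fd_target(int fd, char *buf, size_t size)
{
    char link[64];
    ssize_t len;

    assert(buf != NULL && size > 0);
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    len = readlink(link, buf, size - 1);  /* readlink() adds no '\0' */
    if (len < 0)
        return -1;
    buf[len] = '\0';
    return (long)len;
}

/* Print where stdin, stdout and stderr currently point. */
void print_std_fds(void)
{
    char target[4096];
    int fd;

    for (fd = 0; fd <= 2; fd++) {
        if (fd_target(fd, target, sizeof target) >= 0)
            printf("fd %d -> %s\n", fd, target);
        else
            perror("readlink");
    }
}
```

Calling print_std_fds() early in a test program run from the initrd should show
whether the three descriptors really resolve to /dev/console there.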

Tim Ruehsen
2017-12-31 12:04:54 UTC
Update of bug #51181 (project wget):

Status: Confirmed => Fixed
Open/Closed: Open => Closed
Planned Release: None => 1.19.3

_______________________________________________________

Follow-up Comment #5:

Thanks, applied.

J
2018-06-15 17:32:15 UTC
Follow-up Comment #6, bug #51181 (project wget):

Issue reproduced with 1.19.4 on Ubuntu.

Any news on when the fix will be released, please?

Peter Wu
2018-06-15 19:33:39 UTC
Follow-up Comment #7, bug #51181 (project wget):

This was already fixed in wget 1.19.3, are you sure that you are using this
(or a newer) version?

You really have to provide more details on your environment and the steps to
reproduce. See for example the original description.

J
2018-06-16 16:39:10 UTC
Follow-up Comment #8, bug #51181 (project wget):

Hi Peter Wu

I'm using the latest Ubuntu LTS, which shows version 1.19.4. Logs below. Does it
have something to do with the '&' (ampersand) in the URL?



$ dpkg -l |grep wget
ii wget 1.19.4-1ubuntu2.1

$ wget --version
GNU Wget 1.19.4 built on linux-gnu.


$ wget https://uk.godaddy.com/dpp&key
[1] 5080

Redirecting output to ‘wget-log.3’.

Command 'key' not found, but can be installed with:

sudo apt install donkey

***@asus:~/test$




Peter Wu
2018-06-16 17:34:29 UTC
Follow-up Comment #9, bug #51181 (project wget):

"&" is a special shell character which causes a program to go to the
background. When you execute

wget https://example.com/dpp&key

it will actually be interpreted as:

wget https://example.com/dpp &
key

which will execute "wget https://example.com/dpp" to download that URL in the
background (due to "&"). After that it will execute the command "key" (which
it cannot find in your case).

To have the intended effect of downloading that particular URL, quote your URL
so that the special shell characters are not interpreted:

wget "https://example.com/dpp&key"

See also https://www.tldp.org/LDP/abs/html/quoting.html
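
The same quoting concern applies when a command line is built in a program and
handed to a shell, for example through system(). As a rough sketch only,
assuming a POSIX shell, a URL could be wrapped in single quotes before it is
placed in the command string; shell_quote is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wrap src in single quotes for a POSIX shell and
   write the result to dst. An embedded single quote becomes '\''.
   Returns 0 on success, -1 if dst is too small. */
int shell_quote(const char *src, char *dst, size_t size)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    /* Cheap early check: two quotes, the text itself and the final '\0'. */
    if (strlen(src) + 3 > size)
        return -1;
    dst[n++] = '\'';
    for (; *src != '\0'; src++) {
        if (*src == '\'') {
            /* Close the quote, emit an escaped quote, reopen the quote.
               Keep room for the closing quote and the '\0' as well. */
            if (n + 4 + 2 > size)
                return -1;
            dst[n++] = '\'';
            dst[n++] = '\\';
            dst[n++] = '\'';
            dst[n++] = '\'';
        } else {
            dst[n++] = *src;
        }
    }
    dst[n++] = '\'';
    dst[n] = '\0';
    return 0;
}
```

Passing the arguments as a vector to execvp() avoids the shell altogether, in
which case no quoting is needed.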

J
2018-06-16 18:05:59 UTC
Follow-up Comment #10, bug #51181 (project wget):

Hi Peter

Ok, sorry my example was bad. This is my actual example below.

This program reproduces the issue

***@asus:~/code$ g++ -O2 -Wall -Wextra -Wpedantic -o main main3.cpp
***@asus:~/code$ ./main

Redirecting output to ‘wget-log.2’.
***@asus:~/code$





//g++ -O2 -Wall -Wextra -Wpedantic -o main main3.cpp

#include <string>
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
    // Adjacent string literals are concatenated by the compiler.
    std::string str = "timeout -k 26s 25s wget --output-document dump.html "
                      "http://TajInternational.com/";

    int result = system(str.c_str());

    return result;
}

Tim Ruehsen
2018-06-16 18:50:51 UTC
Update of bug #51181 (project wget):

Status: Fixed => Confirmed
Open/Closed: Closed => Open
Release: 1.19.1 => 1.19.5
Planned Release: 1.19.3 => 1.19.6

_______________________________________________________

Follow-up Comment #11:

The 'timeout' command puts wget into the background.

If you either leave it out or use --foreground with 'timeout', you won't see
wget-log. You can use --timeout=25 for wget if you need a timeout.

But that's a work-around. The issue is that wget behaves contrary to the
manual which says "--background: If no output file is specified via the -o,
output is redirected to wget-log".

I re-open the issue.

Darshit Shah
2018-06-16 21:55:52 UTC
Follow-up Comment #12, bug #51181 (project wget):

Actually, I am unable to reproduce the problem.

`$ timeout -k 26s 25s wget example.com`

does _not_ put Wget in the background. The entire task runs in the
foreground.

And even when wget does run in the background, I don't see how the manual is
incorrect. It says, wget will download to `wget-log`, but if the local file
already exists, due to no-clobbering, Wget will create a unique filename by
appending a counter.

I just don't see what is wrong here
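
To make the naming scheme concrete (wget-log, then wget-log.1, wget-log.2, and
so on), here is a small sketch of how such a unique name could be picked. It
is not wget's actual code; unique_log_name and its max_tries parameter are
hypothetical and serve only as an illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: find the first name in the sequence
   base, base.1, base.2, ... that does not exist yet and store it in buf.
   Returns 0 on success, -1 if buf is too small or the first max_tries
   names are all taken. */
int unique_log_name(const char *base, char *buf, size_t size, int max_tries)
{
    int i, len;

    assert(base != NULL && buf != NULL);
    for (i = 0; i < max_tries; i++) {
        if (i == 0)
            len = snprintf(buf, size, "%s", base);
        else
            len = snprintf(buf, size, "%s.%d", base, i);
        if (len < 0 || (size_t)len >= size)
            return -1;            /* name did not fit into buf */
        if (access(buf, F_OK) != 0)
            return 0;             /* nothing with this name: use it */
    }
    return -1;
}
```

The existence check and the later file creation are separate steps here, so
the sketch is racy; real code would more likely create the file with
O_CREAT | O_EXCL and retry on EEXIST.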

Peter Wu
2018-06-17 11:22:34 UTC
Follow-up Comment #13, bug #51181 (project wget):

I can only reproduce it with fork before execve *and* when using the "timeout"
command:

#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

int main() {
    char *args[] = {
        "timeout", "-k", "26s", "25s",
        "wget", "-O", "test.html", "http://example.com", NULL
    };
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    } else if (pid == 0) {
        /* Child: replace this process with "timeout ... wget ...". */
        execvp(args[0], args);
        perror("execvp");
        return 1;
    } else {
        /* Parent: report an error only if wait() actually fails. */
        if (wait(NULL) < 0)
            perror("wait");
    }
    return 0;
}

It appears that "timeout" is creating a new process group:

977 if (foreground_pgrp != -1 && foreground_pgrp != getpgrp ())
(gdb) p foreground_pgrp
$4 = 12905 # pidof wrapper
(gdb) p (int)getpgrp()
$5 = 12906 # pidof timeout
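
The condition from the excerpt above can be isolated into a small function to
experiment with. This is a sketch only: in_background is a hypothetical name,
and the descriptor to query is left as a parameter because the excerpt does
not show which one wget passes to tcgetpgrp():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper mirroring the condition in the gdb excerpt: the
   process counts as "in the background" only when the terminal's
   foreground process group is known and differs from our own group. */
bool in_background(int tty_fd)
{
    pid_t foreground_pgrp;

    assert(tty_fd >= 0);
    foreground_pgrp = tcgetpgrp(tty_fd);
    /* tcgetpgrp() returns -1 (for example with ENOTTY) when tty_fd is not
       a terminal; the foreground group is then unknown and the process is
       not treated as backgrounded. */
    return foreground_pgrp != -1 && foreground_pgrp != getpgrp();
}
```

When "timeout" moves its child into a new process group, getpgrp() in the
child no longer matches the terminal's foreground group, so a check like this
reports true even though the user never asked for a background job.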

Suggested workarounds:
- Use "timeout --foreground" or,
- Add the "-o-" option to "wget" to force logging to stdout.
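
Both workarounds can be applied to the reproduction program by changing the
argument vector. The sketch below assumes the same URL and output file as the
program above; run_argv is a hypothetical helper that forks, executes the
command and returns its exit status:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Both suggested workarounds at once: "--foreground" for timeout and
   "-o-" for wget. Either one alone is expected to be enough. */
char *workaround_args[] = {
    "timeout", "--foreground", "-k", "26s", "25s",
    "wget", "-o-", "-O", "test.html", "http://example.com", NULL
};

/* Hypothetical helper: fork, exec argv in the child and wait for it.
   Returns the child's exit status, or -1 if it could not be run or did
   not exit normally. */
int run_argv(char *const argv[])
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);               /* exec failed in the child */
    }
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would then use run_argv(workaround_args) in place of the
fork/execvp/wait sequence.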

Given the special behavior of "timeout", isn't this "working as intended"?

Tim Ruehsen
2018-06-17 16:34:49 UTC
Follow-up Comment #14, bug #51181 (project wget):

Sorry, I forgot to mention that I used J's system() call.
Peter Wu wrote:
> Given the special behavior of "timeout", isn't this "working as intended"?

I see wget-log being created with J's C code even though --output-file was
given (-o dump.html):

#include <stdlib.h>
int main (void) {
    /* Adjacent string literals are concatenated by the compiler. */
    int result = system("timeout 25 src/wget -o dump.html "
                        "http://google.com/");
    return result;
}

$ gcc x.c -o x
$ ./x

Redirecting output to ‘wget-log.1’.



J
2018-06-17 20:58:48 UTC
Follow-up Comment #15, bug #51181 (project wget):

Hi Tim
I could not get your code to compile.


My simpler C version of my test case below

What I would suggest is: could this be added to the wget test suite as some
additional tests?

//gcc -O2 -Wall -Wextra -Wpedantic -o main wget_main.c
#include <stdlib.h>
int main (void)
{
    const char * str = "timeout -k 26s 25s wget http://bbc.com/";
    int result = system(str);

    return result;
}



