Jamie Zawinski
2006-07-11 20:29:24 UTC
wget 1.10.2
MacOS 10.4.7 Intel
I'm trying to download a file whose URL contains Japanese characters.
If I specify -O, it is able to download the data; but if wget is
picking the file name itself, it is unable to write the file
("invalid argument"). Neither --restrict-file-names=unix nor
--restrict-file-names=windows affects it.
I guess wget and the OS disagree about what characters can go in file
names?
I also tried setting $LANG and $LOCALE to "C" to no effect.
This is a default HFS+ file system, running in the American English
locale.
% wget -d 'http://somehost/~somewhere/music/Dir%20en%20grey/%e9%ac%bc%e8%91%ac/01%20%e9%ac%bc%e7%9c%bc%20-kigan-.m4a'
DEBUG output created by Wget 1.10.2 on darwin8.6.1.
--13:20:52-- http://somehost/~somewhere/music/Dir%20en%20grey/%e9%ac%bc%e8%91%ac/01%20%e9%ac%bc%e7%9c%bc%20-kigan-.m4a
=> `01 鬼ç%9C¼ -kigan-.m4a'
Resolving [...]
Caching [...]
Connecting to [...]|:80... connected.
Created socket 4.
Releasing 0x00507520 (new refcount 1).
---request begin---
GET /~somewhere/music/Dir%20en%20grey/%e9%ac%bc%e8%91%ac/01%20%e9%ac%bc%e7%9c%bc%20-kigan-.m4a HTTP/1.0
Accept: */*
Authorization: Basic [...]
Host: [...]
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Tue, 11 Jul 2006 20:19:36 GMT
Server: Apache/1.3.33 (Darwin) mod_perl/1.29
Last-Modified: Thu, 22 Dec 2005 20:03:49 GMT
ETag: "1517d-5ab8c7-43ab06a5"
Accept-Ranges: bytes
Content-Length: 5945543
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: audio/mp4a-latm
---response end---
200 OK
Registered socket 4 for persistent reuse.
Length: 5,945,543 (5.7M) [audio/mp4a-latm]
01 鬼ç%9C¼ -kigan-.m4a: Invalid argument
Disabling further reuse of socket 4.
Closed fd 4
Cannot write to `01 鬼ç%9C¼ -kigan-.m4a' (Invalid argument).
Exit 1
"touch" also fails with that file name:
touch: 01 鬼ç%9C¼ -kigan-.m4a: Invalid argument
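A minimal Python sketch of what seems to be going on, assuming wget re-escapes bytes in the 0x80-0x9F range as %XX when it builds the local file name. The literal %9C in the log above suggests this, but it is not confirmed against the wget source. The helper names are hypothetical. The URL's last path segment decodes to valid UTF-8, but escaping the middle byte of the second character leaves a byte sequence that is no longer valid UTF-8, which HFS+ appears to reject with "Invalid argument".

```python
from urllib.parse import unquote_to_bytes

# Last path segment of the URL from the report above.
URL_NAME = "01%20%e9%ac%bc%e7%9c%bc%20-kigan-.m4a"


def escape_c1_bytes(raw: bytes) -> bytes:
    """Re-escape bytes 0x80-0x9F as %XX (assumed wget behaviour, not verified)."""
    out = bytearray()
    for b in raw:
        if 0x80 <= b <= 0x9F:
            out += b"%%%02X" % b  # e.g. 0x9C becomes the three bytes "%9C"
        else:
            out.append(b)
    return bytes(out)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


decoded = unquote_to_bytes(URL_NAME)  # what the server path actually names
escaped = escape_c1_bytes(decoded)    # what wget seems to try to create

print(is_valid_utf8(decoded))  # the plain decoded name
print(is_valid_utf8(escaped))  # the name with 0x9C replaced by "%9C"
```

If that assumption holds, passing the fully decoded name to -O avoids the problem because the name then reaches the file system as intact UTF-8.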
--
Jamie Zawinski ***@jwz.org http://www.jwz.org/
***@dnalounge.com http://www.dnalounge.com/
http://jwz.livejournal.com/