SCRIPT keyword for urlopen

AmigaOS users can make feature requests in this forum.
Raziel
Posts: 1170
Joined: Sat Jun 18, 2011 4:00 pm
Location: a dying planet

Re: SCRIPT keyword for urlopen

Post by Raziel »

@JosDuchIt
CMDFORMAT = "URL=http://%s"
; The exec RawDoFmt() format string specifier, where the
; token '%s' will be the command string.
; The command string will have the protocol part removed.
Taken from the URL Prefs text (found with Notepad)

See the last line?
About the protocol part being removed?

The DEBUG switch clearly says that it will display the line just before it's sent out to the script or program and it told me that the http:// part was still there...maybe it's "lying" and removes that part AFTER the debug window shows the commandline?

Then again, using the script (as the debug window states) should fix that missing http:// again once it's processed by it.
But it seems it doesn't

Out of ideas here
People are dying.
Entire ecosystems are collapsing.
We are in the beginning of a mass extinction.
And all you can talk about is money and fairytales of eternal economic growth.
How dare you!
– Greta Thunberg
colinw
AmigaOS Core Developer
Posts: 207
Joined: Mon Aug 15, 2011 9:20 am
Location: Brisbane, QLD. Australia.

Re: SCRIPT keyword for urlopen

Post by colinw »

Raziel wrote: The DEBUG switch clearly says that it will display the line just before it's sent out to the script or program and it told me
that the http:// part was still there...maybe it's "lying" and removes that part AFTER the debug window shows the commandline?
The last line in the debug requester IS the exact line that is executed in an asynchronous shell.

When you enable the debug switch, it's showing what launch-handler is doing with the supplied info, but there's more
happening if you are calling the C:URLOpen command, so let's go over this step by step and I'll explain how it all works
and what happens within the various parts...

First of all, there's the L:launch-handler that is responsible for the mounted URL: device, and then there's the C:URLOpen
command. The URL: device is called by C:URLOpen, although you can also invoke the URL: device yourself without C:URLOpen
if you happen to be an application that can call the DOS Open() function itself.

For simplicity, let's just start with the C:URLOpen command that is used in a shell, or can be used as a default tool
via Workbench, and what its job is in this situation.

If I were to type (in a shell) the following line: "urlopen http://www.google.com", then the C:URLOpen command
doesn't do much. It just checks whether the command line argument string (the "http://www.google.com" part)
starts with "URL:" and, if not, prepends "URL:" to the front of what you supplied. It then calls DOS Open().
That's all it does: it just provides an executable command interface to the DOS Open() function for passing a command
string to the URL: device.
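
As a rough illustration of that step, here is a small Python sketch. It only models the behaviour described above and is not the C:URLOpen source; the function name is made up, and the case-insensitive comparison is an assumption.

```python
def build_open_path(arg):
    """Return the string that would be handed to DOS Open().

    Hypothetical helper: prepend "URL:" unless the argument already
    starts with it. Comparing without regard to case is an assumption.
    """
    if arg[:4].upper() == "URL:":
        return arg
    return "URL:" + arg


if __name__ == "__main__":
    print(build_open_path("http://www.google.com"))
    print(build_open_path("URL:http://www.google.com"))
```

Both calls end up with the same string, which is the point: URLOpen itself adds nothing beyond the device name.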

By calling the DOS Open() function with the command line "URL: http://www.google.com", DOS will identify the
handler by the first part up to the first colon character, in this case "URL:", which means L:launch-handler,
the same way any path starting with "RAM:" ends up being handled by the ram-handler.

At this stage, L:launch-handler will have received the string; "URL: http://www.google.com".
It immediately strips off the handler identifier from what it received, leaving; "http://www.google.com".
It then parses the remainder stripping off the first part to identify the protocol, leaving; "www.google.com".

Now, with the protocol "http" identified, and the rest "www.google.com", it reads the file "HTTP.LH" (where the
name was derived from protocol), found in the "ENV:launch-handler/URL" directory to find out what to do with it.
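
The parsing steps above can be sketched roughly like this in Python. This is a simplified model, not launch-handler's code: the function name is hypothetical, and the way the slashes after "http:" are dropped is an assumption based only on "www.google.com" being what remains.

```python
def parse_launch_string(received):
    """Split e.g. 'URL: http://www.google.com' into its parts.

    Simplified model of the steps described in the post.
    """
    # Step 1: the handler is identified by everything up to the first colon.
    handler, _, rest = received.partition(":")
    rest = rest.strip()

    # Step 2: the next colon-terminated part names the protocol.
    protocol, _, command = rest.partition(":")
    # Assumption: the "//" after the protocol is removed along with it.
    command = command.lstrip("/")

    # Step 3: the protocol name selects the config file to read.
    config = "ENV:launch-handler/URL/" + protocol.upper() + ".LH"
    return handler, protocol, command, config


if __name__ == "__main__":
    for part in parse_launch_string("URL: http://www.google.com"):
        print(part)
```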

With the information in there, it builds a requester with all the "ClientName" arguments, and creates a command line
for the client program (browser) specified by the "ClientPath" arguments, then uses the "CMDFORMAT" string as the format
to create the final browser command line that will be passed to the client when it is run by L:launch-handler.

In this case; "*"http://%s*"" where %s will be replaced by the "www.google.com" string. Because the "http://"
part was removed in the initial parsing, it needs to be put back on for the client, which is why it is specified again here.

The result is a commandline for the selected client which is shown as the last line in the debug requester.
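
The CMDFORMAT substitution can be pictured with this short Python sketch. It assumes the format string has already been read from the config file (so the *" escapes are plain quotes by now) and only handles the single %s token; the real RawDoFmt() understands more than that. The function name is made up for illustration.

```python
def build_client_argument(cmdformat, command):
    """Insert the protocol-stripped command string into CMDFORMAT.

    Hypothetical helper: replaces only the first '%s' token.
    """
    return cmdformat.replace("%s", command, 1)


if __name__ == "__main__":
    # The format puts back the "http://" that was stripped during parsing
    # and wraps the whole URL in double quotes for the client.
    print(build_client_argument('"http://%s"', "www.google.com"))
```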
Raziel
Posts: 1170
Joined: Sat Jun 18, 2011 4:00 pm
Location: a dying planet

Re: SCRIPT keyword for urlopen

Post by Raziel »

@colinw

Thanks a lot for the explanation.

Due to your insights I did a few more tests with the DEBUG switch.

I found the error...I'm so bold as to claim that URLOpen has a bug :-)

Reason:
This is the url in question:
This is what debug shows as arguments/commandline when using urlopen with the given url in quotes (urlopen "http://the-url")
Image
As you can see everything is fine, the arguments and the commandline hold and send the correct url.


Now for the problems.
This will be used as arguments/commandline when using urlopen with the given url WITHOUT quotes (urlopen http://the-url)
Image
As you can see the ARGUMENTS url is already wrong (after optiextension.dll?ID there was a SPACE inserted and the original "=" was eaten).
The same of course in the commandline. (The url is now wrongly word-wrapped at the place where the SPACE was put).

This "replacement" of the "=" with a SPACE character is leading to a changed url which in turn leads to Odyssey (and any other browser) trying to reach a non-functioning site and failing.

(I made the change bold, still not very good to see, sorry)
http://interactief.standaard.be/optiext ... l?ID[b]%20[/b]BZ4B_2HJEItndmwcAt_aR7AhjeRVTEJbGqTC7wMiRDt1vfWlsuLwzrLZlWT8lEbPk9_sttk2Z7hdxFGAqmIJGYiZtjzdb

So, urlopen eats away a "=" and adds a %20 (SPACE) instead...bad behavior :-)

Question is now, why does it do that?
And are there more characters that get replaced?

Could you please confirm?

Thank you very much
JosDuchIt
Posts: 291
Joined: Sun Jun 26, 2011 5:47 pm

Re: SCRIPT keyword for urlopen

Post by JosDuchIt »

@Raziel, i never noted the debug checkmark
@colinw
Thanks, can you also explain the use of asterisks and double quotes in the CMDFORMAT ?
colinw
AmigaOS Core Developer
Posts: 207
Joined: Mon Aug 15, 2011 9:20 am
Location: Brisbane, QLD. Australia.

Re: SCRIPT keyword for urlopen

Post by colinw »

JosDuchIt wrote:@Raziel, i never noted the debug checkmark
@colinw
Thanks, can you also explain the use of asterisks and double quotes in the CMDFORMAT ?
Without the quotes, the shell commandline parser will probably mess up some URLs, especially ones with spaces.

You want to read the DOS ReadLineItem() function autodoc regarding the asterisks as that function is used to
parse the config files for launch-handler. It is a requirement to be able to specify quotes inside a quoted string.

For quoted strings, ReadLineItem() allows you to substitute some characters within that can't be specified directly.
Escaped character substitutions:
*N returns 0x0a
*E returns 0x1b
** returns *
*" returns "
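
To show how those four substitutions turn the CMDFORMAT value into the string the client sees, here is a rough Python sketch. It is not the ReadLineItem() implementation: the function name is made up, and treating the escape letters case-insensitively and keeping unknown escaped characters as-is are assumptions.

```python
# The four substitutions listed above.
ESCAPES = {"N": "\n", "E": "\x1b", "*": "*", '"': '"'}


def decode_quoted_item(body):
    """Decode the text found between the outer quotes of a quoted item."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "*" and i + 1 < len(body):
            nxt = body[i + 1]
            # Assumption: unknown escapes simply yield the next character.
            out.append(ESCAPES.get(nxt.upper(), nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Inner part of the config value "*"http://%s*""
    print(decode_quoted_item('*"http://%s*"'))
```

So the asterisks are only there to get a literal double quote inside a string that is itself delimited by double quotes.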
Raziel
Posts: 1170
Joined: Sat Jun 18, 2011 4:00 pm
Location: a dying planet

Re: SCRIPT keyword for urlopen

Post by Raziel »

colinw wrote: Without the quotes, the shell commandline parser will probably mess up some URLs, especially ones with spaces.
That may be, but the SPACE isn't part of the original url...it's changed by URLOpen...
broadblues
AmigaOS Core Developer
Posts: 600
Joined: Sat Jun 18, 2011 2:40 am
Location: Portsmouth, UK

Re: SCRIPT keyword for urlopen

Post by broadblues »

Raziel wrote:
colinw wrote: Without the quotes, the shell commandline parser will probably mess up some URLs, especially ones with spaces.
That may be, but the SPACE isn't part of the original url...it's changed by URLOpen...
You *must* use that url with quotes. The = that is 'eaten' is most likely 'eaten' in the argument parsing; it is *not* valid to have a '=' unquoted on the command line.

Call it like so;

urlopen "http://www.someurl.com?query=foo&blah=bar"

For illustration, an example with another command, COPY:

Code: Select all


11.RAM Disk:> echo blah >"foo=bar"
11.RAM Disk:> list
foo=bar                           5 ----rwed Today     13:37:32
{snip}
4 files - 1819K bytes - 2 directories - 123 blocks used
11.RAM Disk:> copy foo=bar to bash
Cannot open "foo" for input - COPY: object not found
11.RAM Disk:> copy "foo=bar" to bash
11.RAM Disk:> list
bash                              5 ----rwed Today     13:39:19
foo=bar                           5 ----rwed Today     13:37:32
{snip}
5 files - 1819K bytes - 2 directories - 125 blocks used

See how without the quotes the file name is mangled. Quotes are always required when special characters are present; if you don't know whether they will be present or not, for example in a script, then add quotes by default.
Raziel
Posts: 1170
Joined: Sat Jun 18, 2011 4:00 pm
Location: a dying planet

Re: SCRIPT keyword for urlopen

Post by Raziel »

@broadblues
broadblues wrote: The = that is 'eaten' is most likely 'eaten' in the argument parsing; it is *not* valid to have a '=' unquoted on the command line.
I know AREXX is not URLPrefs, but just for example, AREXX is reading in every string putting it automatically in quotes.
(Like in our examples earlier in this thread, i can easily give the same url without quotes and it will still work, because it gets read in 1:1 and secured by adding quotes by AREXX)

Wouldn't it be feasible to change/adapt URLPrefs to also automatically add missing quotes to given strings/urls *while* parsing them?

I don't want to sound annoying, but it still looks to me as if URLPrefs is wrongly changing the user input -in the argument parsing-.
broadblues wrote: See how without the quotes the file name is mangled. Quotes are always required when special characters are present; if you don't know whether they will be present or not, for example in a script, then add quotes by default.
But, but, but...that is exactly what we are trying to achieve :-)
Please read the earlier posts again, we do have a working script where we take care of missing quotes and it works with starting the script directly (even giving the url WITHOUT quotes), but due to URLPrefs mangling our input the whole url passed to our script is already wrong and can't be fixed anymore by adding quotes, because the "=" was already changed ... by URLPrefs :-)
broadblues
AmigaOS Core Developer
Posts: 600
Joined: Sat Jun 18, 2011 2:40 am
Location: Portsmouth, UK

Re: SCRIPT keyword for urlopen

Post by broadblues »

Raziel wrote: I know AREXX is not URLPrefs,
No.
Raziel wrote: but just for example, AREXX is reading in every string putting it automatically in quotes.
It certainly doesn't do that at all; if you feed AREXX an unquoted string it will parse it.

eg:

Code: Select all

9.AmigaOS4:> rx "say foo=bar"
0
9.AmigaOS4:> rx "say 'foo=bar'"
foo=bar
Raziel wrote: (Like in our examples earlier in this thread, i can easily give the same url without quotes and it will still work, because it gets read in 1:1 and secured by adding quotes by AREXX)
If you mean your script gets the argument string passed as a whole then yes, but your script is not using ReadArgs to parse it; it does not have a standard command line template. AREXX is *not* automatically adding quotes, e.g.:

Code: Select all

10.AmigaOS4:> rx "parse arg bah; say bah" foo=bash=look=noquotes
foo=bash=look=noquotes
Raziel wrote: Wouldn't it be feasible to change/adapt URLPrefs to also automatically add missing quotes to given strings/urls *while* parsing them?
The problem (which IMHO is not actually a problem, you should just quote arguments with special characters in them) is in ReadArgs(), not specifically URLOpen.

"=" is a special character in that you can put an = sign between a template keyword and its value.

UrlOpen LINK=http://foo.com?blah=bash

instead of

UrlOpen LINK http://foo.com?blah=bash

but if you omit the keyword, which is possible for any keyword that does not have a /K on it, then you are providing broken syntax to the ReadArgs command line parser if you put an '=' in the command argument.

You might argue that ReadArgs should detect this broken syntax and autofix it (I'll let ColinW comment on that; you never know what consequences such autofixing might have on another genuine combination), but you should have quoted your argument in the first place. That's what quotes are for!
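
To make the effect of an unquoted '=' concrete, here is a deliberately simplified Python model of the item splitting. It is not ReadArgs(): it assumes only that, outside double quotes, '=' ends an item exactly like a space does, and it ignores templates, keywords and the * escapes entirely. The function name is made up.

```python
def split_items(line):
    """Split a command line into items (simplified model, see above)."""
    items = []
    current = []
    in_quotes = False
    saw_quote = False  # lets an empty "" still count as an item
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
            saw_quote = True
        elif not in_quotes and (ch.isspace() or ch == "="):
            # Outside quotes, '=' separates items just like a space.
            if current or saw_quote:
                items.append("".join(current))
            current = []
            saw_quote = False
        else:
            current.append(ch)
    if current or saw_quote:
        items.append("".join(current))
    return items


if __name__ == "__main__":
    print(split_items("foo=bar to bash"))
    print(split_items('"foo=bar" to bash'))
```

In this model the unquoted call hands COPY the separate items foo and bar, which fits the "Cannot open "foo" for input" error in the earlier example, while the quoted call keeps foo=bar in one piece.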

[edit] fixed my horrendous typing and a few formatting errors...
xenic
Posts: 1185
Joined: Sun Jun 19, 2011 12:06 am

Re: SCRIPT keyword for urlopen

Post by xenic »

broadblues wrote: You might argue that ReadArgs should detect this broken syntax and autofix it (I'll let ColinW comment on that; you never know what consequences such autofixing might have on another genuine combination), but you should have quoted your argument in the first place. That's what quotes are for!
I've had a discussion with Colin about the "=" substitution before and suggested that "=" should only be substituted with a space if it is preceded by a keyword in the template. However, I'm not sure if it would be wise to change argument parsing this late in the game; it's been the way it is for decades.
AmigaOne X1000 with 2GB memory - OS4.1 FE