Simple FTP support in UrlDownloadToFile


4 replies to this topic
Lexikos
  • Administrators
  • 9391 posts
  • Last active:
  • Joined: 17 Oct 2006
Since the help file doesn't specify which types of URL are supported, one might assume it supports protocols other than HTTP. However, if UrlDownloadToFile is given an FTP URL, it fails at InternetReadFileEx, and GetLastError returns 12018: "The type of handle supplied is incorrect for this operation." MSDN isn't very clear about the limitations of InternetReadFileEx, but compare its description with that of InternetReadFile:

InternetReadFileEx: Reads data from a handle opened by the InternetOpenUrl or HttpOpenRequest function.

InternetReadFile: Reads data from a handle opened by the InternetOpenUrl, FtpOpenFile, GopherOpenFile, or HttpOpenRequest function.

I made a few quick changes to confirm that InternetReadFile would allow UrlDownloadToFile to work with FTP:
ResultType Line::URLDownloadToFile(char *aURL, char *aFilespec)
{
	[color=green]// Check that we have IE3 and access to wininet.dll[/color]
	HINSTANCE hinstLib = LoadLibrary("wininet");
	if (!hinstLib)
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);

	typedef HINTERNET (WINAPI *MyInternetOpen)(LPCTSTR, DWORD, LPCTSTR, LPCTSTR, DWORD dwFlags);
	typedef HINTERNET (WINAPI *MyInternetOpenUrl)(HINTERNET hInternet, LPCTSTR, LPCTSTR, DWORD, DWORD, LPDWORD);
	typedef BOOL (WINAPI *MyInternetCloseHandle)(HINTERNET);
	typedef BOOL (WINAPI *MyInternetReadFileEx)(HINTERNET, LPINTERNET_BUFFERS, DWORD, DWORD);
[color=darkred]	typedef BOOL (WINAPI *MyInternetReadFile)(HINTERNET, LPVOID, DWORD, LPDWORD); [/color]

	#ifndef INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY
		#define INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY 4
	#endif

	[color=green]// Get the address of all the functions we require.  It's done this way in case the system[/color]
	[color=green]// lacks MSIE v3.0+, in which case the app would probably refuse to launch at all:[/color]
 	MyInternetOpen lpfnInternetOpen = (MyInternetOpen)GetProcAddress(hinstLib, "InternetOpenA");
	MyInternetOpenUrl lpfnInternetOpenUrl = (MyInternetOpenUrl)GetProcAddress(hinstLib, "InternetOpenUrlA");
	MyInternetCloseHandle lpfnInternetCloseHandle = (MyInternetCloseHandle)GetProcAddress(hinstLib, "InternetCloseHandle");
	MyInternetReadFileEx lpfnInternetReadFileEx = (MyInternetReadFileEx)GetProcAddress(hinstLib, "InternetReadFileExA");
[color=darkred]	MyInternetReadFile lpfnInternetReadFile = (MyInternetReadFile)GetProcAddress(hinstLib, "InternetReadFile"); [/color]
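	if (!(lpfnInternetOpen && lpfnInternetOpenUrl && lpfnInternetCloseHandle && lpfnInternetReadFileEx [color=darkred]&& lpfnInternetReadFile[/color]))
	{
		FreeLibrary(hinstLib); [color=green]// Release the library on this early exit, as the later error paths do.[/color]
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);
	}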

	[color=green]// v1.0.44.07: Set default to INTERNET_FLAG_RELOAD vs. 0 because the vast majority of usages would want[/color]
	[color=green]// the file to be retrieved directly rather than from the cache.[/color]
	[color=green]// v1.0.46.04: Added more no-cache flags because otherwise, it definitely falls back to the cache if[/color]
	[color=green]// the remote server doesn't respond (and perhaps other errors), which defeats the ability to use[/color]
	[color=green]// UrlDownloadToFile for uptime/server monitoring.  Also, in spite of what MSDN says, it seems nearly[/color]
	[color=green]// certain based on other sources that more than one flag is supported.  Someone also mentioned that[/color]
	[color=green]// INTERNET_FLAG_CACHE_IF_NET_FAIL is related to this, but there's no way to specify it in these[/color]
	[color=green]// particular calls, and it's the opposite of the desired behavior anyway; so it seems impossible to[/color]
	[color=green]// turn it off explicitly.[/color]
	DWORD flags_for_open_url = INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_CACHE_WRITE;
[color=darkred]	bool is_http = *aURL == 'h'; [/color]
	aURL = omit_leading_whitespace(aURL);
	if (*aURL == '*') [color=green]// v1.0.44.07: Provide an option to override flags_for_open_url.[/color]
	{
		flags_for_open_url = ATOU(++aURL);
		char *cp;
		if (cp = StrChrAny(aURL, " \t")) [color=green]// Find first space or tab.[/color]
			aURL = omit_leading_whitespace(cp);
	}

	[color=green]// Open the internet session. v1.0.45.03: Provide a non-NULL user-agent because some servers reject[/color]
	[color=green]// requests that lack a user-agent.  Furthermore, it's more professional to have one, in which case it[/color]
	[color=green]// should probably be kept as simple and unchanging as possible.  Using something like the script's name[/color]
	[color=green]// as the user agent (even if documented) seems like a bad idea because it might contain personal/sensitive info.[/color]
	HINTERNET hInet = lpfnInternetOpen("AutoHotkey", INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY, NULL, NULL, 0);
	if (!hInet)
	{
		FreeLibrary(hinstLib);
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);
	}

	[color=green]// Open the required URL[/color]
	HINTERNET hFile = lpfnInternetOpenUrl(hInet, aURL, NULL, 0, flags_for_open_url, 0);
	if (!hFile)
	{
		lpfnInternetCloseHandle(hInet);
		FreeLibrary(hinstLib);
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);
	}

	[color=green]// Open our output file[/color]
	FILE *fptr = fopen(aFilespec, "wb");	[color=green]// Open in binary write/destroy mode[/color]
	if (!fptr)
	{
		lpfnInternetCloseHandle(hFile);
		lpfnInternetCloseHandle(hInet);
		FreeLibrary(hinstLib);
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);
	}

	BYTE bufData[1024 * 1]; [color=green]// v1.0.44.11: Reduced from 8 KB to alleviate GUI window lag during UrlDownloadtoFile.  Testing shows this reduction doesn't affect performance on high-speed downloads (in fact, downloads are slightly faster; I tested two sites, one at 184 KB/s and the other at 380 KB/s).  It might affect slow downloads, but that seems less likely so wasn't tested.[/color]
	INTERNET_BUFFERS buffers = {0};
	buffers.dwStructSize = sizeof(INTERNET_BUFFERS);
	buffers.lpvBuffer = bufData;
	buffers.dwBufferLength = sizeof(bufData);

	LONG_OPERATION_INIT

	[color=green]// Read the file.  I don't think synchronous transfers typically generate the pseudo-error[/color]
	[color=green]// ERROR_IO_PENDING, so that is not checked here.  That's probably just for async transfers.[/color]
	[color=green]// IRF_NO_WAIT is used to avoid requiring the call to block until the buffer is full.  By[/color]
	[color=green]// having it return the moment there is any data in the buffer, the program is made more[/color]
	[color=green]// responsive, especially when the download is very slow and/or one of the hooks is installed:[/color]
	BOOL result;
	[color=darkred]if (is_http)
	{[/color]
		while (result = lpfnInternetReadFileEx(hFile, &buffers, IRF_NO_WAIT, NULL)) [color=green]// Assign[/color]
		{
			if (!buffers.dwBufferLength) [color=green]// Transfer is complete.[/color]
				break;
			LONG_OPERATION_UPDATE  [color=green]// Done in between the net-read and the file-write to improve avg. responsiveness.[/color]
			fwrite(bufData, buffers.dwBufferLength, 1, fptr);
			buffers.dwBufferLength = sizeof(bufData);  [color=green]// Reset buffer capacity for next iteration.[/color]
		}
	[color=darkred]}
	else
	{
		DWORD number_of_bytes_read;
		while (result = lpfnInternetReadFile(hFile, bufData, sizeof(bufData), &number_of_bytes_read))
		{
			if (!number_of_bytes_read)
				break;
			LONG_OPERATION_UPDATE
			fwrite(bufData, number_of_bytes_read, 1, fptr);
		}
	}[/color]

	[color=green]// Close internet session:[/color]
	lpfnInternetCloseHandle(hFile);
	lpfnInternetCloseHandle(hInet);
	FreeLibrary(hinstLib); [color=green]// Only after the above.[/color]
	[color=green]// Close output file:[/color]
	fclose(fptr);

	if (result)
		return g_ErrorLevel->Assign(ERRORLEVEL_NONE);  [color=green]// Indicate success.[/color]
	else [color=green]// An error occurred during the transfer.[/color]
	{
		DeleteFile(aFilespec);  [color=green]// delete damaged/incomplete file[/color]
		return g_ErrorLevel->Assign(ERRORLEVEL_ERROR);
	}
}
Since the comments imply InternetReadFileEx may allow the script to be more responsive while downloading, the code above uses it (only) if the URL begins with 'h'. (MSDN says "Only URLs beginning with ftp:, gopher:, http:, or https: are supported.")
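Both branches above follow the same pattern: call the read function repeatedly, treat a successful call that returns zero bytes as the end of the transfer, and treat a failed call as an error. The following is a minimal, portable sketch of that pattern, not the AutoHotkey code itself. `ReadFn`, `drain`, `make_memory_reader` and `demo` are hypothetical names, and an in-memory reader stands in for InternetReadFile so the loop can be exercised without WinINet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>

// Hypothetical stand-in for InternetReadFile: writes up to `cap` bytes into
// `buf`, stores the number written in `*got`, and returns false on error.
using ReadFn = std::function<bool(void *buf, std::size_t cap, std::size_t *got)>;

// Appends everything the reader yields to `out`.
// Returns true when the reader signals end of data, false on a read error.
bool drain(const ReadFn &read, std::string &out)
{
    unsigned char buf[1024];
    std::size_t got = 0;
    while (read(buf, sizeof(buf), &got))
    {
        if (got == 0) // Success with zero bytes: transfer is complete.
            return true;
        out.append(reinterpret_cast<const char *>(buf), got);
    }
    return false; // The reader reported an error mid-transfer.
}

// Demonstration reader (an assumption for this sketch): serves `data` from
// memory, at most `chunk` bytes per call, then reports zero bytes.
ReadFn make_memory_reader(std::string data, std::size_t chunk)
{
    std::size_t pos = 0;
    return [data, pos, chunk](void *buf, std::size_t cap, std::size_t *got) mutable
    {
        std::size_t n = std::min(std::min(chunk, cap), data.size() - pos);
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        *got = n;
        return true;
    };
}

// Usage: drain a short payload delivered in 3-byte pieces.
void demo()
{
    std::string out;
    bool ok = drain(make_memory_reader("hello, ftp", 3), out);
    assert(ok && out == "hello, ftp");
}
```

In the real function, the caller would write each chunk to the output file instead of appending it to a string, and would delete the partial file when the loop ends with an error.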

I tested with the URL form ftp://username:password@host:port/path and successfully retrieved a text file and an FTP directory listing in HTML form.

I also removed the presumably redundant "| INTERNET_FLAG_NO_CACHE_WRITE" from:
DWORD flags_for_open_url = INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_CACHE_WRITE;

[Edit @ 2009-07-29: At some point (maybe the beginning) this thread became an announcement. I'm not sure how that happened; it certainly wasn't intentional. I've corrected it.]

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
I did it two years ago:
http://www.autohotke...topic19608.html

I'm a bit curious why InternetReadFileEx doesn't work with FTP. Is it related to the IRF_NO_WAIT flag?

Lexikos
  • Administrators
  • 9391 posts
  • Last active:
  • Joined: 17 Oct 2006
It doesn't work with or without IRF_NO_WAIT. Without that flag, there doesn't seem to be any reason to use InternetReadFileEx rather than InternetReadFile. I haven't been able to find any concrete information about InternetReadFileEx and FTP. Interestingly, MSDN suggests InternetReadFileEx supports FTP on Windows CE:

This function reads data from a handle opened by the InternetOpenUrl, FtpOpenFile, or HttpOpenRequest function.
Source: MSDN: InternetReadFileEx (Windows CE)



Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
It looks like InternetReadFileEx is a bit of a misnomer, or more likely was never fully implemented as planned, although I don't really know what the original plan was.

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
Sorry for the late reply.

Thanks for this useful discovery and your code and testing. This will be included in the next release.

I moved the following logic lower to support the possibility of a *0 option/prefix:
bool is_http = *aURL == 'h';
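To illustrate why the order matters: if the check runs before leading whitespace and the `*flags` prefix are skipped, a URL such as `*0 http://...` is classified as non-HTTP because its first character is `*`. Here is a standalone sketch of the reordered logic. `ParsedUrlArg`, `parse_url_arg`, `skip_blanks` and `starts_with_nocase` are hypothetical names, `strtoul` stands in for AutoHotkey's ATOU, and matching the full `http:` or `https:` prefix case-insensitively is an assumption of this sketch (the code in this thread tests only the first character).

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cstring>

// Hypothetical result type for this sketch; not part of AutoHotkey.
struct ParsedUrlArg
{
    const char *url;     // First character of the actual URL.
    unsigned long flags; // Flags to pass on to InternetOpenUrl.
    bool is_http;        // True for http: and https: URLs.
};

static const char *skip_blanks(const char *s)
{
    while (*s == ' ' || *s == '\t')
        ++s;
    return s;
}

// `prefix` must be lowercase. Stops safely at the end of `s`.
static bool starts_with_nocase(const char *s, const char *prefix)
{
    for (; *prefix; ++s, ++prefix)
        if (std::tolower(static_cast<unsigned char>(*s)) != *prefix)
            return false;
    return true;
}

ParsedUrlArg parse_url_arg(const char *arg, unsigned long default_flags)
{
    ParsedUrlArg r = { skip_blanks(arg), default_flags, false };
    if (*r.url == '*') // Optional "*flags" prefix overrides the default flags.
    {
        ++r.url;
        r.flags = std::strtoul(r.url, nullptr, 0);
        if (const char *ws = std::strpbrk(r.url, " \t")) // First space or tab.
            r.url = skip_blanks(ws);
    }
    // Decide the scheme only now, after the prefix has been consumed.
    r.is_http = starts_with_nocase(r.url, "http:") || starts_with_nocase(r.url, "https:");
    return r;
}

// Usage: with the prefix present, the URL is still recognized as HTTP.
void demo()
{
    ParsedUrlArg p = parse_url_arg("*0 http://example.com/", 5);
    assert(p.flags == 0 && p.is_http);
}
```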