Thursday, 28 March 2013

Using Powershell to download multiple files

Powershell


Powershell is a shell scripting language with command line access, based on .NET, which  gives
the user and the developer access to an abundance of system resources, plus intra- and internetworked resources. In this article, an example of how Powershell can be used to download multiple files from the internet, and save these files to a local file folder, will be given.

Powershell is as previously mentioned based upon .NET. This gives Powershell many capabilities as a command line shell and scripting language. Syntaxwise, Powershell in many respects are similar to Perl or PHP, but there are also simliaries to other .NET Languages. At the same time, this is a command line shell script. It's role is to inherit the role that the Command Line Prompt (CMD.exe) and MS-DOS previously covered. Instead of the .BAT files of CMD and  MS-DOS, scripts are written in Powershell scripts, with the extension .PS1. There are more advanced topics to cover, such as modules of Powershell, but this article will cover some basic use of Powershell.

I have used the Windows Powershell ISE, which is the Integrated Script Environment.
The term ISE is similar to IDE, which stands for Integrated Development Environment. Powershell ISE is included in Windows 8, which I have used. Windows 7 users can download the Windows Management Framework 3.0 package, which is available at the following URL:

http://www.microsoft.com/en-us/download/details.aspx?id=34595


The above package is available also for Windows 2008 Server R2 and Windows 2012 Server.

Downloading files with Powershell

Let's first review the script which downloads some files (in my example I create an Object array of Url addresses which points to Flickr thumbnail images of mine).

            
$flickrImages = 'http://farm9.static.flickr.com/8370/8595533065_4f19f05869_m.jpg',             
'http://farm9.static.flickr.com/8239/8566909448_cac6a5ea75_m.jpg',             
'http://farm9.static.flickr.com/8375/8566909032_70866e8ce7_m.jpg',             
'http://farm9.static.flickr.com/8520/8566908388_2d0bcdc572_m.jpg',             
'http://farm9.static.flickr.com/8049/8131277333_83586d9afc_m.jpg',             
'http://farm9.static.flickr.com/8291/7605276718_a7e8c2d7d8_m.jpg',             
'http://farm6.static.flickr.com/5321/7407130682_c4c26ca2d9_m.jpg',             
'http://farm8.static.flickr.com/7089/7379569808_8e0311918c_m.jpg',             
'http://farm8.static.flickr.com/7088/7378798094_ac05bedf72_m.jpg',             
'http://farm8.static.flickr.com/7245/7217153808_f4e5125b4e_m.jpg',             
'http://farm8.static.flickr.com/7243/7217105052_6b1fc818f8_m.jpg',             
'http://farm9.static.flickr.com/8143/7191404308_47bb667be6_m.jpg',             
'http://farm8.static.flickr.com/7089/7149219081_bfe587bd34_m.jpg',             
'http://farm8.static.flickr.com/7260/7145533071_c7241d0064_m.jpg',             
'http://farm8.static.flickr.com/7048/6999442724_e399e63e68_m.jpg',             
'http://farm9.static.flickr.com/8158/6986866246_d04ec09342_m.jpg',             
'http://farm8.static.flickr.com/7124/7057325243_db81f4c65a_m.jpg',             
'http://farm8.static.flickr.com/7089/6911179760_24654b32fd_m.jpg',             
'http://farm8.static.flickr.com/7268/6910639934_31cd36a854_m.jpg',             
'http://farm8.static.flickr.com/7274/7052136425_6be954f436_m.jpg',             
'http://farm8.static.flickr.com/7037/7052113599_2a249aa763_m.jpg',             
'http://farm8.static.flickr.com/7072/6906010348_1fd6513b31_m.jpg',             
'http://farm8.static.flickr.com/7089/6902820096_95a7ec11c4_m.jpg',             
'http://farm8.static.flickr.com/7280/7048911257_89ae08a75e_m.jpg',             
'http://farm6.static.flickr.com/5467/7048911105_7370eff5ef_m.jpg',            
'http://farm6.static.flickr.com/5031/6902819694_763fc65fe0_m.jpg',             
'http://farm6.static.flickr.com/5234/6902819562_0f78bc56f4_m.jpg',             
'http://farm8.static.flickr.com/7279/7048910545_f02faeda37_m.jpg',             
'http://farm8.static.flickr.com/7080/7047789289_c93ff1deac_m.jpg',             
'http://farm8.static.flickr.com/7220/7047688059_216d67e9d3_m.jpg',             
'http://farm8.static.flickr.com/7092/6901520442_637c138c1f_m.jpg'            
            
$targetDir = 'C:\users\Tore Aurstad\Documents\Powershell Scripts\WebClient\'            
            
            
function DownloadFile([Object[]] $sourceFiles,[string]$targetDirectory) {            
 $wc = New-Object System.Net.WebClient            
             
 foreach ($sourceFile in $sourceFiles){            
  $sourceFileName = $sourceFile.SubString($sourceFile.LastIndexOf('/')+1)            
  $targetFileName = $targetDirectory + $sourceFileName            
  $wc.DownloadFile($sourceFile, $targetFileName)            
  Write-Host "Downloaded $sourceFile to file location $targetFileName"             
 }            
            
}            
            
DownloadFile $flickrImages $targetDir            

Many .NET developers are unfamiliar with Powershell scripting language syntax. Variables are declared by prefixing the variable name with the $-sign. I declare an Object Array, which is basically a comma separated list. This syntax will be familiar to Perl Developers. Also note the absence of semi-colons. Semi-colons are not obligatory in Powershell, and these are therefore omitted.

Further on, I declare a variable for where to put the downloaded files. If you want to test out this script yourself, you have to change this path, obviously.
Next on, I declare a function in Powershell for downloading the files. This function takes in an Object array and a string for the target Directory or folder.

The function is called without using commas or parentheses. In Powershell every function argument is passed in using spaces. This is where the Powershell syntax feels a bit different to other Microsoft Languages. If you use commas when calling the function in this example, the first argument will receive an Object Array, while the second argument will be empty. Be aware of this Powershell gotcha - use spaces when calling a function with multiple arguments.

Let's investigate the function a bit closer. As you can see, although Powershell allows mutable types (which means that a variable containing for example a string can be redefined to an integer), prefixing the arguments with square brackets and a type, such as [Object[]] or [string], makes your Powershell scripts more type-safe. Powershell obviously in many cases feels like Javascript for example, where also a variable can be changed into another type (redefined). If you want to have more type safety, prefix your function arguments with the strongly type you expect being passed into the Powershell function. The variable $wc is instantiated into a new System.Net.WebClient instance, using the New-Object cmdlet (Command let) in Powershell. To perform the download of each file in the passed in Object Array, I use a foreach loop.

This is where Powershell shows itself as a convenient scripting language. Iterating through a collection using foreach loop makes coding much easier, note though that Perl already implemented this functionality some years ago. Inside the foreach loop, the call to the function or method $wc.DownloadFile downloads the file from the Internet using the instantiated WebClient. I am not sure if .BAT files could achieve this, but this shows how much functionality is available to Powershell users and Developers. Powershell is well suited for system maintenance and administration, plus actually are larger part of target uses than one might think of a shell script language and command line tool.

The fact that much of .NET is available means that .NET Developers in some cases must rethink their use of the relationship between compiled programs and scripts, now that Powershell is so available. The added convenience and flexibility of Powershell, plus functionality makes Powershell a contender among the more mature .NET Programming Languages such as VB and C#. If you implemement something in a Programming language, think of doing the same in a script. Sometimes going for a script based solution will be better, more flexible and optimal.

For developers using .NET, the use of SubString and LastIndexOf shows that many .NET methods and types are available for us in Powershell.

If you wonder how I formatted the Powershell code above, I downloaded the PowerShellPack utility from here:

Powershell pack

After you have downloaded PowershellPack and installed it, follow the following guide:

Copy selected text in Powershell ISE to colored Html


Click the thumbnail below to see the script running inside Windows Powershell ISE:




Windows Powershell ISE has an editor and a command line below, where Powershell scripts can be executed and Powershell itself can be used. To start the PowerShell script, I type .\WebClient.ps1, since I saved the script above in a file called WebClient.ps1.

Users of BASH and other shell scripting Languages will feel familiar using Powershell, as multiple commands are linked to Unix familiar commands, such as DIR in MS-DOS and CMD is aliased to the command ls. More commands can be aliased, such that Unix Developers can manage part of Windows Clients and servers and still have that familiar Unix syntax feel of it ... This concludes our introductory tour of Powershell. I would like to say happy coding, but it is perhaps more correct to say happy (Powershell) scripting?

3 comments:

  1. If you are looking for some good Powershell resources, check out Pluralsight website. Pluralsight got many hours of videos covering introductory, intermediate and expert use of Powershell scripting.

    ReplyDelete
  2. Hi. Im trying to use your code to download multiple files using a MySQL database as URL-Source. The files must be downloaded simultanously and every X Seconds. Could you point me in the right direction With this? thank you :)

    ReplyDelete
  3. If you want to download simultaneously, use the DownloadAsync method of the web client in the sample above. You are also interested in instantating a timer to run the script every X second.

    check out: http://stackoverflow.com/questions/4926060/powershell-runspace-problem-with-downloadfileasync
    https://www.cogmotive.com/blog/powershell/querying-mysql-from-powershell
    https://www.cogmotive.com/blog/powershell/querying-mysql-from-powershell

    Register-ObjectEvent cmdlet is probably worth checking out

    ReplyDelete