Introduction
When we execute a loop in a script, we process all iterations one by one, in other words, sequentially. Often this is perfectly fine, but sometimes it is desirable or necessary to run the iterations in parallel. This post covers one way to achieve that: ForEach-Object, specifically with the -Parallel parameter. Unfortunately for those limited to Windows PowerShell 5.1 (or lower), this parameter is only available from PowerShell 7.0 onwards. An alternative is to run the work as a Job, which I will cover in a future blog post.
Syntax
Let’s first look at the parameters we can use. For this explanation, I will limit myself to the parameters that can be used alongside -Parallel.
Parallel <Scriptblock>
The code, in the form of a scriptblock, that we want to execute in parallel.
InputObject <PSObject>
Provide the input here. If you use this parameter instead of piping the input, keep in mind that a collection is treated as a single object. Therefore, unless it has a specific purpose, I advise against using this parameter. In all examples, we will use piping.
ThrottleLimit <Integer>
Limits the number of scriptblocks that run in parallel at the same time. Default: 5. If you use this parameter together with -AsJob, it does not limit the number of jobs that can be created.
TimeoutSeconds <Integer>
The number of seconds to wait until all input is processed. When the time elapses, all tasks that are still running are stopped and any remaining input is ignored. The default value of 0 disables the timeout, allowing the loop to run indefinitely.
AsJob <Switch>
Turns the scriptblock into a single job with a number of child jobs for the parallel tasks. The result is a job that we can work with further using all job-related cmdlets.
UseNewRunspace <Switch>
Causes the parallel invocation to create a new runspace for every loop iteration instead of reusing runspaces from the runspace pool.
WhatIf <Switch>
Shows what would happen if the cmdlet were executed, without actually executing it.
Confirm <Switch>
Requires confirmation before execution. Default: False
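To make the -AsJob description a bit more concrete, here is a minimal sketch of how the resulting job could be handled with the regular job cmdlets. It assumes PowerShell 7.0 or later; the input range and the message text are made up for illustration.

```powershell
# Requires PowerShell 7.0 or later.
# Start the parallel loop as a background job instead of waiting for it inline.
$job = 1..4 | ForEach-Object -Parallel {
    Start-Sleep -Milliseconds 200
    "Processed item $_"
} -AsJob

# The returned object is a single parent job; the parallel tasks are its child jobs.
$job.ChildJobs.Count

# Wait for the job to finish, then collect the output of all child jobs.
$output = $job | Wait-Job | Receive-Job

# Sort for display, because the completion order is not guaranteed.
$output | Sort-Object

# Clean up the finished job.
$job | Remove-Job
```

Because the loop runs as a job, the console stays available while the tasks are running, and the output can be picked up later with Receive-Job.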
Examples of Foreach-Object
Okay, we now have an idea of what we can use. Let’s put it into practice and see the effect in terms of time savings.
I’ll start by defining a collection. I will perform an action on the list of names below. I’m using a Generic List, but any other kind of array can be used here without any issues.
[System.Collections.Generic.List[String]]$Names = @(
"Alice",
"Bob",
"Charlie",
"David",
"Eve",
"Frank",
"Grace",
"Hannah"
)
And the action we are going to perform:
$Action = {
    $name = $_
    Write-Output "Hello, $name!"
    Write-Output "Your name has $($name.Length) characters."
    Start-Sleep -Milliseconds 500
}
Let’s combine the above and execute it without -Parallel.
$Names | Foreach-Object $Action
Hello, Alice!
Your name has 5 characters.
Hello, Bob!
Your name has 3 characters.
Hello, Charlie!
Your name has 7 characters.
Hello, David!
Your name has 5 characters.
Hello, Eve!
Your name has 3 characters.
Hello, Frank!
Your name has 5 characters.
Hello, Grace!
Your name has 5 characters.
Hello, Hannah!
Your name has 6 characters.
And to have a baseline for runtime, let’s see how long this code takes to run.
Measure-Command { $Names | Foreach-Object $Action } | Select-Object TotalMilliseconds
TotalMilliseconds
-----------------
4059.93
The relatively long execution time is due to the Start-Sleep. With such a small dataset, some extra time per iteration is needed to make the difference between running with and without -Parallel visible.
Here’s a brief explanation of what happens: We write 2 lines of text to the output (for simplicity, let’s call it the screen). We wait 500 milliseconds and repeat these steps for all 8 names. 8 * 500 milliseconds = 4000 milliseconds, plus some overhead for setting up the runtime and writing the text to the screen.
Parallel
Now, we’ll execute the above code again, but this time with -Parallel, using the default ThrottleLimit of 5.
Measure-Command { $Names | Foreach-Object -Parallel $Action } | Select-Object TotalMilliseconds
TotalMilliseconds
-----------------
1067.83
That’s quite a significant difference. With up to 5 tasks running at the same time, the 8 names are handled in 2 rounds (5 names first, then the remaining 3), so instead of 8 * 500 milliseconds we effectively wait 2 * 500 milliseconds.
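The arithmetic behind these numbers can be sketched in a few lines. The function name Get-EstimatedRuntime is hypothetical, and the estimate is deliberately rough: it assumes every task takes equally long and ignores all overhead.

```powershell
# Hypothetical helper: rough runtime estimate, ignoring overhead.
function Get-EstimatedRuntime {
    param(
        [int]$ItemCount,
        [int]$ThrottleLimit,
        [int]$TaskMilliseconds
    )
    # Number of "rounds" needed when at most $ThrottleLimit tasks run at once.
    $rounds = [math]::Ceiling($ItemCount / [double]$ThrottleLimit)
    $rounds * $TaskMilliseconds
}

# Sequential: 8 rounds of 500 ms
Get-EstimatedRuntime -ItemCount 8 -ThrottleLimit 1 -TaskMilliseconds 500   # 4000
# Default ThrottleLimit of 5: 2 rounds of 500 ms
Get-EstimatedRuntime -ItemCount 8 -ThrottleLimit 5 -TaskMilliseconds 500   # 1000
```

The measured values of 4059.93 and 1067.83 milliseconds are these estimates plus overhead.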
Attentive readers will also notice that the overhead is now slightly higher (roughly 68 milliseconds versus 60). This overhead always varies somewhat and depends strongly on how many resources the PowerShell process can use and is allowed to use. In this case, the extra overhead comes from the system needing slightly more time to set up 5 parallel runtimes than 1 (sequential) runtime. The balance tips further in favor of -Parallel for scriptblocks that take more time. See the example below for pinging a subnet.
$PingIP = {
    $ip = "192.168.1.$_"
    $result = Test-Connection -ComputerName $ip -Count 1 -Quiet
    if ($result) {
        Write-Output "$ip is online"
    } else {
        Write-Output "$ip is offline"
    }
}
(Measure-Command { 1..254 | ForEach-Object $PingIP }).TotalMilliseconds
697786.11
(Measure-Command { 1..254 | ForEach-Object -Parallel $PingIP }).TotalMilliseconds
140972.3888
For each host address in 192.168.1.0/24 (192.168.1.1 through 192.168.1.254, so 254 IPs), we check whether we can ping it. In the sequential execution, we have to wait out the timeout period for an IP that is not online before testing the next IP. In my test, this results in a total of 697786 milliseconds (698 seconds), more than 11.5 minutes.
If we execute this in parallel with 5 tasks at a time, the other threads can continue while one waits for the timeout period. This drastically reduces the time: 140972 milliseconds (141 seconds), just under 2.5 minutes. The parallel loop finished in 20.2% of the time the original (sequential) loop needed.
ThrottleLimit
So far, we have been working with the default ThrottleLimit value of 5, but if you have a powerful machine with multiple cores, you might want to work with a higher limit. We can then apply the -ThrottleLimit parameter.
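## max 10 parallel tasks
$Names | Foreach-Object -Parallel $Action -ThrottleLimit 10
## or we base it on the available logical processors
## ([Environment]::ProcessorCount also works on Linux and macOS, where $Env:NUMBER_OF_PROCESSORS is not set)
$Names | Foreach-Object -Parallel $Action -ThrottleLimit ([Environment]::ProcessorCount)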
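To get a feeling for the effect of the limit, you could time the same loop with a few different values. The sketch below assumes PowerShell 7.0 or later and uses a scriptblock that only sleeps. The limits 1, 5, and 8 are arbitrary choices, and the measured times will differ per machine.

```powershell
# Requires PowerShell 7.0 or later.
$Names = 'Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace', 'Hannah'

# Time the same 8 tasks of 500 ms each with different throttle limits.
$timings = foreach ($limit in 1, 5, 8) {
    $elapsed = Measure-Command {
        $Names | ForEach-Object -Parallel {
            Start-Sleep -Milliseconds 500
        } -ThrottleLimit $limit
    }
    [pscustomobject]@{
        ThrottleLimit     = $limit
        TotalMilliseconds = [math]::Round($elapsed.TotalMilliseconds)
    }
}

$timings
```

A limit above the number of input items brings no further gain, since there is nothing left to run in parallel.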
Using Variables in Parallel
Besides a difference in throughput, there is another important difference between a regular Foreach-Object and a Foreach-Object -Parallel. If we want to use variables or functions that we have defined locally, things work a bit differently. Let’s look at some examples.
$FavoriteFruit = "Banana"
[System.Collections.Generic.List[String]]$Fruits = @(
"Apple",
"Banana",
"Cherry",
"Date",
"Fig",
"Grape"
)
$Fruits | Foreach-Object {
    if ($_ -eq $FavoriteFruit) {
        Write-Host "I like $_ the best!"
    } else {
        Write-Host "I like $_, but not as much as $FavoriteFruit."
    }
}
I like Apple, but not as much as Banana.
I like Banana the best!
I like Cherry, but not as much as Banana.
I like Date, but not as much as Banana.
I like Fig, but not as much as Banana.
I like Grape, but not as much as Banana.
Okay, that’s what we expected. But now in parallel:
$Fruits | Foreach-Object -Parallel {
    if ($_ -eq $FavoriteFruit) {
        Write-Host "I like $_ the best!"
    } else {
        Write-Host "I like $_, but not as much as $FavoriteFruit."
    }
}
I like Apple, but not as much as .
I like Cherry, but not as much as .
I like Banana, but not as much as .
I like Date, but not as much as .
I like Grape, but not as much as .
I like Fig, but not as much as .
Strange, or is it? The reason the if-statement doesn’t work correctly here, and that we don’t have the value of $FavoriteFruit, is actually quite logical. With Foreach-Object, we execute the code in the scriptblock in our local session/runtime. But when we use -Parallel, the tasks run in separate runtimes (runspaces). Such a runtime only has the information available within its own scope, and in this case, that’s just the scriptblock. The about_Scopes help topic explains scopes in more detail.
How do we get this to work? We reference the variable $FavoriteFruit with the $Using: scope modifier, as $Using:FavoriteFruit.
$Fruits | ForEach-Object -Parallel {
    if ($_ -eq $Using:FavoriteFruit) {
        Write-Host "I like $_ the best!"
    } else {
        Write-Host "I like $_, but not as much as $Using:FavoriteFruit."
    }
}
I like Banana the best!
I like Apple, but not as much as Banana.
I like Cherry, but not as much as Banana.
I like Date, but not as much as Banana.
I like Grape, but not as much as Banana.
I like Fig, but not as much as Banana.
And now we have the output we expected. Another thing that stands out is that the order differs from the run without -Parallel. With -Parallel, output appears as each task finishes, so the order is not guaranteed. In this run, “Banana” is among the first 5 items (the default ThrottleLimit) and takes the shortest path through the script, so it finished first. This is a good example of what we call ‘cost’: the lower the cost of running the code, the faster and more efficient it is.
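If the original order matters, one possible approach (a sketch, not the only option) is to carry the position of each item along with its result and sort afterwards. The property names Index and Text are chosen for illustration, and PowerShell 7.0 or later is assumed.

```powershell
# Requires PowerShell 7.0 or later.
$Fruits = 'Apple', 'Banana', 'Cherry', 'Date', 'Fig', 'Grape'

# Pipe the positions instead of the items, so each result remembers where it came from.
$ordered = 0..($Fruits.Count - 1) | ForEach-Object -Parallel {
    # Copy the outer collection into a local variable inside this runspace.
    $list = $Using:Fruits
    [pscustomobject]@{
        Index = $_
        Text  = "I like $($list[$_])."
    }
} | Sort-Object -Property Index

# The text lines are now back in the order of $Fruits.
$ordered.Text
```

The sort only happens after all tasks have finished, so this trades the streaming output for a predictable order.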
The above are examples with variables, but the same happens with all other objects defined outside the scope, such as functions. See Master Remote Sessions: Effortlessly Deploy Local Functions for an example of how to solve this.
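For functions, a commonly used pattern is to pass the function definition as a string with $Using: and recreate the function inside the parallel scriptblock. The sketch below is an assumption on my part and not necessarily the approach from the post mentioned above. The function Get-Greeting and the variable $greetingDefinition are hypothetical names, and PowerShell 7.0 or later is assumed.

```powershell
# Requires PowerShell 7.0 or later.
# A locally defined function (hypothetical example).
function Get-Greeting {
    param([string]$Name)
    "Hello, $Name!"
}

# Capture the function body as a string so it can cross the runspace boundary.
$greetingDefinition = ${function:Get-Greeting}.ToString()

$greetings = 'Alice', 'Bob', 'Charlie' | ForEach-Object -Parallel {
    # Recreate the function inside this runspace from the passed-in definition.
    ${function:Get-Greeting} = $Using:greetingDefinition
    Get-Greeting -Name $_
}

# Sort for display, because the completion order is not guaranteed.
$greetings | Sort-Object
```

The function is rebuilt in every iteration, which adds a little overhead of its own, so this is mainly worthwhile when the function does real work.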
Faster
For tasks that are more intensive or time-consuming, we now have a good solution. But, and I won’t be the only one wondering, is this the maximum we can achieve? The answer is no. The next step is too advanced for this post, so it seems appropriate to cover it in a separate one.
However, I will give a little hint. The way to make everything faster lies in the extra overhead generated when creating separate runtimes. Foreach-Object -Parallel sets up a separate runtime for its tasks, runs the code, and tears the runtime down again. The first and last steps are the extra overhead. What if we could eliminate that?
In a post next week, I will delve deeper into this and explore advanced parallel task execution with you.
Summary
With Foreach-Object -Parallel, we can execute time-intensive tasks faster. For short tasks, it is more efficient to execute them sequentially because of the extra overhead of creating separate runspaces. When executing the loop in parallel, we must keep in mind that the tasks run in separate runspaces, which do not have access to variables or other objects defined outside that scope unless we pass them in, for example with $Using:.