PowerShell stripping double quotes from command line arguments

It is a known thing:

It's FAR TOO HARD to pass parameters to applications which require quoted strings. I asked this question in IRC with a "roomful" of PowerShell experts, and it took hour for someone to figure out a way (I originally started to post here that it is simply not possible). This completely breaks PowerShell's ability to serve as a general purpose shell, because we can't do simple things like executing sqlcmd. The number one job of a command shell should be running command-line applications... As an example, trying to use SqlCmd from SQL Server 2008, there is a -v parameter which takes a series of name:value parameters. If the value has spaces in it, you must quote it...

...there is no single way to write a command line to invoke this application correctly, so even after you master all 4 or 5 different ways of quoting and escaping things, you're still guessing as to which will work when ... or, you can just shell out to cmd, and be done with it.


TL;DR

If you just want a solution for Powershell 5, see:

ConvertTo-ArgvQuoteForPoSh.ps: Powershell V5 (and C# Code) to allow escaping native command arguments

The Question I will try to answer

..., it appears PowerShell is stripping double quotes from command line arguments, even when properly escaped.

PS C:\Documents and Settings\Nick> echo.exe '"hello"'
hello 
PS C:\Documents and Settings\Nick> echo.exe '\"hello\"' 
"hello"

Notice that the double quotes are there when passed to PowerShell's echo cmdlet, but when passed as an argument to echo.exe, the double quotes are stripped unless escaped with a backslash (even though PowerShell's escape character is a backtick, not a backslash).

This seems like a bug to me. If I am passing the correct escaped strings to PowerShell, then PowerShell should take care of whatever escaping may be necessary for however it invokes the command.

What is going on here?

The Non-Powershell Background

The fact that you need to escape the quotes with backslashes \ has nothing to to with powershell, but with the CommandLineToArgvW function that is used by all msvcrt and C# programs to build the argv array from the single-string command line that the Windows process gets passed.

The details are explained at Everyone quotes command line arguments the wrong way and it basically boils down to the fact that this function historically has very uninutitive escaping rules:

  • 2n backslashes followed by a quotation mark produce n backslashes followed by begin/end quote. This does not become part of the parsed argument, but toggles the "in quotes" mode.
  • (2n) + 1 backslashes followed by a quotation mark again produce n backslashes followed by a quotation mark literal ("). This does not toggle the "in quotes" mode.
  • n backslashes not followed by a quotation mark simply produce n backslashes.

leading to the described generic escaping function (shortquote of the logic here):

CommandLine.push_back (L'"');

for (auto It = Argument.begin () ; ; ++It) {
      unsigned NumberBackslashes = 0;

      while (It != Argument.end () && *It == L'\\') {
              ++It;
              ++NumberBackslashes;
      }

      if (It == Argument.end ()) {
              // Escape all backslashes, but let the terminating
              // double quotation mark we add below be interpreted
              // as a metacharacter.
              CommandLine.append (NumberBackslashes * 2, L'\\');
              break;
      } else if (*It == L'"') {
              // Escape all backslashes and the following
              // double quotation mark.
              CommandLine.append (NumberBackslashes * 2 + 1, L'\\');
              CommandLine.push_back (*It);
      } else {
              // Backslashes aren't special here.
              CommandLine.append (NumberBackslashes, L'\\');
              CommandLine.push_back (*It);
      }
}

CommandLine.push_back (L'"');

The Powershell specifics

Now, up to Powershell 5 (including PoSh 5.1.18362.145 on Win10/1909) PoSh knows basically diddly about these rules, nor should it arguably, because these rules are not really general, because any executable you call could, in theory, use some other means to interpret the passed command line.

Which leads us to -

The Powershell Quoting Rules

What PoSh does do however is try to figure out whether the strings you pass it as arguments to the native commands need to be quoted because they contain whitespace.

PoSh - in contrast to cmd.exe - does a lot more parsing on the command you hand it, since it has to resolve variables and knows about multiple arguments.

So, given a command like

$firs  = 'whaddyaknow'
$secnd = 'it may have spaces'
$third = 'it may also have "quotes" and other \" weird \\ stuff'
EchoArgs.exe $firs $secnd $third

Powershell has to take a stance on how to create the single string CommandLine for the Win32 CreateProcess (or rather the C# Process.Start) call it will evetually have to do.

The approach Powershell takes is weird and got more complicated in PoSh V7 , and as far as I can follow, it's got to do how powershell treats unbalanced quotes in unquoted string. The long stories short is this:

Powershell will auto-quote (enclose in <">) a single argument string, if it contains spaces and the spaces don't mix with an uneven number of (unsescaped) double quotes.

The specific quoting rules of PoSh V5 make it impossible to pass a certain category of string as single argument to a child process.

PoSh V7 fixed this, so that as long as all quotes are \" escaped -- which they need to be anyway to get them through CommandLineToArgvW -- we can pass any aribtrary string from PoSh to a child executable that uses CommandLineToArgvW.

Here's the rules as C# code as extracted from the PoSh github repo for a tool class of ours:

PoSh Quoting Rules V5

    public static bool NeedQuotesPoshV5(string arg)
    {
        // bool needQuotes = false;
        int quoteCount = 0;
        for (int i = 0; i < arg.Length; i++)
        {
            if (arg[i] == '"')
            {
                quoteCount += 1;
            }
            else if (char.IsWhiteSpace(arg[i]) && (quoteCount % 2 == 0))
            {
                // needQuotes = true;
                return true;
            }
        }
        return false;
    }

PoSh Quoting Rules V7

    internal static bool NeedQuotesPoshV7(string arg)
    {
        bool followingBackslash = false;
        // bool needQuotes = false;
        int quoteCount = 0;
        for (int i = 0; i < arg.Length; i++)
        {
            if (arg[i] == '"' && !followingBackslash)
            {
                quoteCount += 1;
            }
            else if (char.IsWhiteSpace(arg[i]) && (quoteCount % 2 == 0))
            {
                // needQuotes = true;
                return true;
            }

            followingBackslash = arg[i] == '\\';
        }
        // return needQuotes;
        return false;
    }

Oh yeah, and they also added in a half baked attempt to correctly escape the and of the quoted string in V7:

if (NeedQuotes(arg))
{
      _arguments.Append('"');
      // need to escape all trailing backslashes so the native command receives it correctly
      // according to http://www.daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULESDOC
      _arguments.Append(arg);
      for (int i = arg.Length - 1; i >= 0 && arg[i] == '\\'; i--)
      {
              _arguments.Append('\\');
      }

      _arguments.Append('"');

The Powershell Situation

Input to EchoArgs             | Output V5 (powershell.exe)  | Output V7 (pwsh.exe)
===================================================================================
EchoArgs.exe 'abc def'        | Arg 0 is <abc def>          | Arg 0 is <abc def>
------------------------------|-----------------------------|---------------------------
EchoArgs.exe '\"nospace\"'    | Arg 0 is <"nospace">        | Arg 0 is <"nospace">
------------------------------|-----------------------------|---------------------------
EchoArgs.exe '"\"nospace\""'  | Arg 0 is <"nospace">        | Arg 0 is <"nospace">
------------------------------|-----------------------------|---------------------------
EchoArgs.exe 'a\"bc def'      | Arg 0 is <a"bc>             | Arg 0 is <a"bc def>
                              | Arg 1 is <def>              |
------------------------------|-----------------------------|---------------------------
   ...

I'm snipping further examples here for time reasons. They shouldn't add overmuch to the answer anyways.

The Powershell Solution

To pass arbitrary Strings from Powershell to a native command using CommandLineToArgvW, we have to:

  • properly escape all quotes and Backslashes in the source argument
    • This means recognizing the special string-end handling for backslashes that V7 has. (This part is not implemented in the code below.)
  • and determine whether powershell will auto-quote our escaped string and if it won't auto-quote it, quote it ourselves.
    • and make sure that the string we quoted ourselves then doesn't get auto-quoted by powershell: This is what breaks V5.

Powershell V5 Source code for correctly escaping all arguments to any native command

I've put the full code on Gist, as it got too long to include here: ConvertTo-ArgvQuoteForPoSh.ps: Powershell V5 (and C# Code) to allow escaping native command arguments

  • Note that this code tries it's best, but for some strings with quotes in the payload and V5 you simply must add in leading space to the arguments you pass. (See code for logic details).

I personally avoid using '\' to escape things in PowerShell, because it's not technically a shell escape character. I've gotten unpredictable results with it. In double-quoted strings, you can use "" to get an embedded double-quote, or escape it with a back-tick:

PS C:\Users\Droj> "string ""with`" quotes"
string "with" quotes

The same goes for single quotes:

PS C:\Users\Droj> 'string ''with'' quotes'
string 'with' quotes

The weird thing about sending parameters to external programs is that there is additional level of quote evaluation. I don't know if this is a bug, but I'm guessing it won't be changed, because the behavior is the same when you use Start-Process and pass in arguments. Start-Process takes an array for the arguments, which makes things a bit clearer, in terms of how many arguments are actually being sent, but those arguments seem to be evaluated an extra time.

So, if I have an array, I can set the argument values to have embedded quotes:

PS C:\cygwin\home\Droj> $aa = 'arg="foo"', 'arg=""""bar""""'
PS C:\cygwin\home\Droj> echo $aa
arg="foo"
arg=""""bar""""

The 'bar' argument has enough to cover the extra hidden evaluation. It's as if I send that value to a cmdlet in double-quotes, then send that result again in double-quotes:

PS C:\cygwin\home\Droj> echo "arg=""""bar""""" # level one
arg=""bar""
PS C:\cygwin\home\Droj> echo "arg=""bar""" # hidden level
arg="bar"

One would expect these arguments to be passed to external commands as-is, as they are to cmdlets like 'echo'/'write-output', but they are not, because of that hidden level:

PS C:\cygwin\home\Droj> $aa = 'arg="foo"', 'arg=""""bar""""'
PS C:\cygwin\home\Droj> start c:\cygwin\bin\echo $aa -nonew -wait
arg=foo arg="bar"

I don't know the exact reason for it, but the behavior is as if there is another, undocumented step being taken under the covers that re-parses the strings. For example, I get the same result if I send the array to a cmdlet, but add a parsing level by doing it through invoke-expression:

PS C:\cygwin\home\Droj> $aa = 'arg="foo"', 'arg=""""bar""""'
PS C:\cygwin\home\Droj> iex "echo $aa"
arg=foo
arg="bar"

...which is exactly what I get when I send these arguments to my external Cygwin instance's 'echo.exe':

PS C:\cygwin\home\Droj> c:\cygwin\bin\echo 'arg="foo"' 'arg=""""bar""""'
arg=foo arg="bar"