How do I remove non-ASCII characters from filenames?

I believe this will work...

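# Select only the items whose names contain characters outside the printable ASCII range
$Files = Get-ChildItem | Where-Object { $_.Name -match "[^\u0020-\u007F]" }

$Files | ForEach-Object {
  $OldName = $_.Name
  # Replace every character outside the range with an underscore
  $NewName = $OldName -replace "[^\u0020-\u007F]", "_"
  # -LiteralPath keeps characters such as [ and ] from being treated as wildcards
  Rename-Item -LiteralPath $_.FullName -NewName $NewName
}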

I don't have any filenames with non-ASCII characters to test against, though.
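
For testing, files with non-ASCII names can be created from code points, which avoids having to type the characters at the prompt. A minimal sketch, assuming a scratch folder under the temp directory (the folder name ascii-rename-test and the sample file names are made up for illustration):

```powershell
# Create a scratch folder (hypothetical name) so real files are not touched
$testDir = Join-Path ([IO.Path]::GetTempPath()) 'ascii-rename-test'
New-Item -ItemType Directory -Path $testDir -Force | Out-Null

# Build names from code points: U+00E9 (e with acute), U+00DF (sharp s), U+4E2D (a CJK character)
$names = @(
  ('caf' + [char]0x00E9 + '.txt'),
  ('stra' + [char]0x00DF + 'e.txt'),
  ('file' + [char]0x4E2D + '.txt')
)
foreach ($name in $names) {
  New-Item -ItemType File -Path (Join-Path $testDir $name) -Force | Out-Null
}

# List only the names that contain characters outside the printable ASCII range
Get-ChildItem -LiteralPath $testDir |
  Where-Object { $_.Name -match '[^\u0020-\u007F]' } |
  Select-Object -ExpandProperty Name
```

Running the rename script from inside that folder then gives a safe place to see what it does.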


I found a similar topic here on Stack Overflow.

With the following code, most characters are translated to their closest ASCII equivalent, although there was one character I couldn't get translated. (Maybe it does work; I can't create a filename containing it at the prompt ;) The ß is not translated either, because it has no decomposed form with a separate accent mark.
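
The approach relies on Unicode decomposition: an accented letter is split into a base letter plus a combining mark, and the mark is then dropped. A small sketch of that step on its own, using two sample characters picked for illustration:

```powershell
# U+00E9 (e with acute) decomposes into 'e' followed by the combining mark U+0301
$eAcute = [string][char]0x00E9
$decomposed = $eAcute.Normalize([Text.NormalizationForm]::FormD)
$decomposed.Length                                                    # 2
[Globalization.CharUnicodeInfo]::GetUnicodeCategory($decomposed[1])   # NonSpacingMark

# U+00DF (sharp s) has no decomposition, so there is no mark to strip
$sharpS = [string][char]0x00DF
$sharpS.Normalize([Text.NormalizationForm]::FormD).Length             # 1
```

That is why characters like ß pass through the function unchanged.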

function Remove-Diacritics {
param ([String]$src = [String]::Empty)
  $normalized = $src.Normalize( [Text.NormalizationForm]::FormD )
  $sb = new-object Text.StringBuilder
  $normalized.ToCharArray() | % {
    if( [Globalization.CharUnicodeInfo]::GetUnicodeCategory($_) -ne [Globalization.UnicodeCategory]::NonSpacingMark) {
      [void]$sb.Append($_)
    }
  }
  $sb.ToString()
}

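$files = Get-ChildItem -Recurse | Where-Object { $_.Name -match "[^\u0020-\u007F]" }
$files | ForEach-Object {
  $newname = Remove-Diacritics $_.Name
  if ($_.Name -ne $newname) {
    # Build the target from the parent folder, so a folder that happens to
    # contain the file name in its own name is not rewritten as well
    $dir = [IO.Path]::GetDirectoryName($_.FullName)
    $num = 1
    $nextname = Join-Path $dir $newname
    while (Test-Path -LiteralPath $nextname) {
      # Name is taken: try "name (1).ext", "name (2).ext", ...
      $next = ([io.fileinfo]$newname).BaseName + " ($num)" + ([io.fileinfo]$newname).Extension
      $nextname = Join-Path $dir $next
      $num += 1
    }
    Write-Output $nextname
    # Pass only the leaf name as the new name; -LiteralPath avoids wildcard expansion
    Rename-Item -LiteralPath $_.FullName -NewName ([IO.Path]::GetFileName($nextname))
  }
}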

Edit:

I added some code that checks whether the target filename already exists and appends (1), (2), etc. if it does. (It's not smart enough to detect an existing (1) in the filename being renamed, so in that case you would get (1) (1). But as always... everything is programmable ;)
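
One way the (1) (1) case could be handled is to look for an existing " (n)" suffix and keep counting from n. This is only a sketch; the helper name Get-UniquePath and its parameters are hypothetical and not part of the script above:

```powershell
# Hypothetical helper: returns a path inside $Directory that does not exist yet.
# If the base name already ends in " (n)", counting continues from n
# instead of appending a second suffix.
function Get-UniquePath {
  param([string]$Directory, [string]$Name)
  $base = [IO.Path]::GetFileNameWithoutExtension($Name)
  $ext  = [IO.Path]::GetExtension($Name)
  $num  = 0
  if ($base -match '^(.*) \((\d+)\)$') {
    $base = $Matches[1]          # name without the " (n)" suffix
    $num  = [int]$Matches[2]     # continue from the existing counter
  }
  $candidate = Join-Path $Directory $Name
  while (Test-Path -LiteralPath $candidate) {
    $num++
    $candidate = Join-Path $Directory ("$base ($num)$ext")
  }
  $candidate
}
```

With a file "report (1).txt" already present, Get-UniquePath $dir 'report (1).txt' would return the path for "report (2).txt" rather than "report (1) (1).txt".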

Edit 2:

Here is the last one for tonight...

This one uses a different function for replacing the characters. I also added a line that changes unknown characters (ß, for example) to _.

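function Convert-ToLatinCharacters {
param([string]$inputString)
  # Encoding to the Cyrillic code page appears to rely on best-fit mapping: accented Latin
  # letters become plain ASCII bytes; decoding as ASCII then turns everything else into '?'
  [Text.Encoding]::ASCII.GetString([Text.Encoding]::GetEncoding("Cyrillic").GetBytes($inputString))
}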

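$files = Get-ChildItem -Recurse | Where-Object { $_.Name -match "[^\u0020-\u007F]" }
$files | ForEach-Object {
  $newname = Convert-ToLatinCharacters $_.Name
  # Characters that could not be converted come back as '?'; replace them with '_'
  $newname = $newname.Replace('?','_')
  if ($_.Name -ne $newname) {
    # Build the target from the parent folder, so a folder that happens to
    # contain the file name in its own name is not rewritten as well
    $dir = [IO.Path]::GetDirectoryName($_.FullName)
    $num = 1
    $nextname = Join-Path $dir $newname
    while (Test-Path -LiteralPath $nextname) {
      # Name is taken: try "name (1).ext", "name (2).ext", ...
      $next = ([io.fileinfo]$newname).BaseName + " ($num)" + ([io.fileinfo]$newname).Extension
      $nextname = Join-Path $dir $next
      $num += 1
    }
    Write-Output $nextname
    # Pass only the leaf name as the new name; -LiteralPath avoids wildcard expansion
    Rename-Item -LiteralPath $_.FullName -NewName ([IO.Path]::GetFileName($nextname))
  }
}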