How to split a large text file in Windows?

' filter.vbs: reads lines from StdIn, keeps or drops lines at the top or bottom, and writes the rest to StdOut.
Set Arg = WScript.Arguments
Set Inp = WScript.Stdin
Set Outp = WScript.Stdout
Set rs = CreateObject("ADODB.Recordset")
With rs
    ' Build an in-memory recordset with one row per input line.
    .Fields.Append "LineNumber", 3      ' 3 = adInteger (4 would be adSingle, which loses precision on very long files)
    .Fields.Append "Txt", 201, 5000     ' 201 = adLongVarChar
    .Open
    LineCount = 0
    Do Until Inp.AtEndOfStream
        LineCount = LineCount + 1
        .AddNew
        .Fields("LineNumber").Value = LineCount
        .Fields("Txt").Value = Inp.ReadLine
        .Update
    Loop

    .Sort = "LineNumber ASC"

    ' Arg(0) is the command name (cut); Arg(1) is t or b; Arg(2) is i or x; Arg(3) is the number of lines.
    If LCase(Arg(1)) = "t" Then
        If LCase(Arg(2)) = "i" Then
            .Filter = "LineNumber < " & (CLng(Arg(3)) + 1)
        ElseIf LCase(Arg(2)) = "x" Then
            .Filter = "LineNumber > " & CLng(Arg(3))
        End If
    ElseIf LCase(Arg(1)) = "b" Then
        If LCase(Arg(2)) = "i" Then
            .Filter = "LineNumber > " & (LineCount - CLng(Arg(3)))
        ElseIf LCase(Arg(2)) = "x" Then
            .Filter = "LineNumber < " & (LineCount - CLng(Arg(3)) + 1)
        End If
    End If

    ' Write out whatever rows pass the filter.
    Do While Not .EOF
        Outp.WriteLine .Fields("Txt").Value
        .MoveNext
    Loop
End With

Cut

filter cut {t|b} {i|x} NumOfLines

Keeps or removes the given number of lines from the top or bottom of the file.

t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines

Example (prints the first 5 lines of win.ini)

cscript /nologo filter.vbs cut t i 5 < "%systemroot%\win.ini"
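For comparison, if Git Bash is available (it is covered further down), the four modes of the filter can be approximated with head and tail. This is only a sketch of equivalent commands, not part of the VBScript above; the sample file and the value n=3 are made up for illustration, and the negative count passed to head is a GNU extension.

```shell
# Hypothetical 10-line sample standing in for a real input file.
sample=$(mktemp)
seq 1 10 > "$sample"
n=3

top_include=$(head -n "$n" "$sample")             # like: cut t i 3  (first 3 lines)
top_exclude=$(tail -n +"$((n + 1))" "$sample")    # like: cut t x 3  (everything after line 3)
bottom_include=$(tail -n "$n" "$sample")          # like: cut b i 3  (last 3 lines)
bottom_exclude=$(head -n -"$n" "$sample")         # like: cut b x 3  (all but the last 3; GNU head only)

# Show one of the results.
printf '%s\n' "$bottom_include"
```

Both tools stream their input, so they also work at the end of a pipe in the same way the VBScript filter reads StdIn.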

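Another way: this outputs lines 5001 onward; adapt it for your use. It uses almost no memory because only one line is held at a time.

Set Inp = WScript.Stdin
Set Outp = WScript.Stdout
Count = 0
Do Until Inp.AtEndOfStream
    Count = Count + 1
    ' Always read the line so the stream advances; only write it once past line 5000.
    CurrentLine = Inp.ReadLine
    If Count > 5000 Then
        Outp.WriteLine CurrentLine
    End If
Loop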

If you have installed Git for Windows, you should have Git Bash installed, since that comes with Git.

Use the split command in Git Bash to split a file:

  • into files of size 500MB each: split myLargeFile.txt -b 500m

  • into files with 10000 lines each: split myLargeFile.txt -l 10000
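Before running split on a real file, it can be reassuring to try it on something small. The following is a minimal sketch, assuming GNU split as shipped with Git Bash; sample.txt, its 2500-byte size, and the 1k piece size are made up for illustration.

```shell
# Work in a scratch directory so the generated pieces are easy to find.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample: 2500 bytes of filler standing in for myLargeFile.txt.
head -c 2500 /dev/zero | tr '\0' 'a' > sample.txt

# -b 1k caps each piece at 1024 bytes, so 2500 bytes become 1024 + 1024 + 452.
split -b 1k sample.txt

# Show the size of each piece.
wc -c x*

# Concatenating the pieces in name order should give back the original file.
cat x* > rejoined.txt
cmp sample.txt rejoined.txt && echo "rejoined file is identical"
```

The same cat-and-cmp check works for pieces produced with -l.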

Tips:

  • If you don't have Git/Git Bash, download at https://git-scm.com/download

  • If you lost the shortcut to Git Bash, you can run it using C:\Program Files\Git\git-bash.exe

That's it!


I always like examples though...

Example:

[Image: the files generated by split]

You can see in this image that the files generated by split are named xaa, xab, xac, etc.

These names are made up of a prefix and a suffix, which you can specify. Since I didn't specify what I want the prefix or suffix to look like, the prefix defaulted to x, and the suffix defaulted to a two-character alphabetical enumeration.
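The default naming can be reproduced on a small scale. This sketch assumes GNU split; the 25-line sample.txt and the 10-line piece size are invented for the demonstration.

```shell
# Scratch directory for the demonstration.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 25-line sample file.
seq 1 25 > sample.txt

# No prefix given, so the pieces get the default names xaa, xab, xac.
split -l 10 sample.txt

# List the pieces and count the lines in each (10, 10 and 5).
wc -l x*
```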

Another Example:

This example demonstrates

  • using a filename prefix of MySlice (instead of the default x),
  • the -d flag for using numerical suffixes (instead of aa, ab, ac, etc...),
  • and the option -a 5 to tell it I want the suffixes to be 5 digits long:

[Image: split run with the MySlice prefix, -d, and -a 5]
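A command along these lines should exercise all three options. It is a sketch rather than a copy of the original screenshot: the sample file and the -l 10 piece size are assumptions, and -d is a GNU split option.

```shell
# Scratch directory for the demonstration.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 25-line sample file.
seq 1 25 > sample.txt

# MySlice is the prefix, -d asks for numeric suffixes, -a 5 makes them 5 digits wide.
split -l 10 -d -a 5 sample.txt MySlice

# Expect MySlice00000, MySlice00001 and MySlice00002.
ls MySlice*
```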


The batch file below splits a file into pieces of LPF lines each (15000 as written; edit the value to change it).

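@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=upload.txt
REM Edit this value to change the number of lines per file.
SET LPF=15000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile

REM Do not change beyond this line.

REM The last three characters of the input name are reused as the extension, so a 3-letter extension is assumed.
SET SFX=%BFN:~-3%

SET /A LineNum=0
SET /A FileNum=1

REM "delims=" keeps each line whole; the original "delims==" cut every line at its first equals sign.
REM Writing the redirection first and using echo with an opening parenthesis avoids a trailing space on every line.
REM Limitations: FOR /F skips blank lines and lines starting with a semicolon, and delayed expansion drops exclamation marks.
For /F "usebackq delims=" %%l in ("%BFN%") Do (
    SET /A LineNum+=1

    >>"%SFN%!FileNum!.%SFX%" echo(%%l

    if !LineNum! EQU !LPF! (
        SET /A LineNum=0
        SET /A FileNum+=1
    )
)
endlocal
Pause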

See below: https://forums.techguy.org/threads/solved-split-a-100000-line-csv-into-5000-line-csv-files-with-dos-batch.1023949/
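For comparison, roughly the same result can be had in Git Bash with split. This is a sketch under assumptions, not a drop-in replacement: it relies on the GNU options --numeric-suffixes and --additional-suffix, it uses 4 lines per file in place of LPF=15000 so the demo stays small, and the output names come out as SplitFile01.txt, SplitFile02.txt, and so on rather than SplitFile1.txt.

```shell
# Scratch directory for the demonstration.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 10-line stand-in for upload.txt.
seq 1 10 > upload.txt

# -l 4 plays the role of LPF; numbering starts at 1 and each piece keeps the .txt extension.
split -l 4 --numeric-suffixes=1 --additional-suffix=.txt upload.txt SplitFile

# Expect SplitFile01.txt, SplitFile02.txt and SplitFile03.txt.
ls SplitFile*
```

Unlike the FOR /F loop, split copies blank lines and special characters through unchanged.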