Suitable commands to replace a set of text

In MS Word, press CTRL+H:

  • Find what: (*),*\(*\)(*^13)(*^13)*^13^13
  • Replace with: \1^13\3^13^13
  • Select "Use wildcards" among the Search Options.


  • (*),*\(*\)(*^13)(*^13)*^13^13: searches for the Word, the Part of Speech, etc., then finds the Synonym and everything up to 2 paragraph characters (^13^13).
  • \1^13\3^13^13: replaces what was found with the Word, a paragraph character, the Synonym, and 2 paragraph characters.

Using Notepad++ on your example data, with Search Mode set to "Regular expression" in the Replace dialog:

Find what: ^([^\r\n\,]+)[^\r\n]*\r\n([^\r\n]+)\r\n[^\r\n]+\r\n[^\r\n]+$
Replace with: \1\r\n\2
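
For anyone who would rather script this, here is a minimal Python sketch of the same idea. It assumes each record is four lines (headword line, synonyms, definition, example sentence) followed by a blank line; the sample text and the function name are made up for illustration. Line endings are normalized to \n first, because in Python's multiline mode `$` matches only before \n, not before \r\n, so the Notepad++ pattern cannot be pasted in unchanged.

```python
import re

# Hypothetical sample in the assumed four-line record layout.
SAMPLE = (
    "abate, verb (abated, abating)\n"
    "lessen, subside\n"
    "to become less intense\n"
    "The storm abated by morning.\n"
    "\n"
    "candid, adjective\n"
    "frank, honest\n"
    "truthful and straightforward\n"
    "She gave a candid answer.\n"
)

# Group 1: headword up to the first comma. Group 2: the synonym line.
# The last two lines of the record are matched but not captured.
RECORD = re.compile(r"^([^\n,]+)[^\n]*\n([^\n]+)\n[^\n]+\n[^\n]+$", re.MULTILINE)


def keep_word_and_synonyms(text):
    """Reduce each four-line record to its headword and synonym line."""
    text = text.replace("\r\n", "\n")  # normalize Windows line endings
    return RECORD.sub(r"\1\n\2", text)


result = keep_word_and_synonyms(SAMPLE)
print(result)
```

Records that do not have exactly this shape (for example, a missing example sentence) are left untouched by the substitution, so it is worth scanning the output for leftovers.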

Have you thought of importing it into a spreadsheet as a CSV? To make it easier to import, you'd need a column separator other than the end-of-line characters (carriage return plus line feed, CR+LF, ASCII 13 + ASCII 10 on Windows), because you also use those to separate records, but that is easily fixed. The directions below apply to the free Notepad++ text editor, but similar capability is in most other text editors and word processors. Before you start, though, make a backup copy of the original file so that you can revert to it if something goes wrong.

Notepad++ Search & Replace

  • Search for CR/LF.
    • Press Ctrl+H to open the Replace dialog.
    • Under Search Mode, enable Extended (\n, \r, \t, \0, \x...).
    • Search for \r\n if using the Windows newline convention.
    • Replace with \t, the tab character.
  • Now replace the doubled tabs, where you had double-spaced each record, with CR/LF. Follow the previous directions, but search for two tabs, \t\t, and replace with CR/LF, \r\n.
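
The two replacements above can also be sketched in a few lines of Python, which may help in checking what the result should look like. The function name and sample text are hypothetical, and the sketch assumes Windows (CR/LF) line endings with exactly one blank line between records.

```python
def records_to_tsv(text):
    """Turn blank-line-separated records into one tab-separated row each."""
    # Step 1: every CR/LF becomes a tab, so the fields of a record are tab-separated.
    flat = text.replace("\r\n", "\t")
    # Step 2: a doubled tab marks the blank line that separated two records.
    return flat.replace("\t\t", "\r\n")


# Hypothetical sample with two four-line records.
sample = (
    "abate, verb\r\nlessen, subside\r\nto become less intense\r\n"
    "The storm abated.\r\n\r\n"
    "candid, adjective\r\nfrank, honest\r\ntruthful\r\nShe was candid.\r\n"
)
tsv = records_to_tsv(sample)
print(repr(tsv))
```

Note that if the file ends with a single CR/LF, the last row keeps a trailing tab; a spreadsheet will just show that as an empty extra column.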

If you want to eliminate part of speech and word forms, you'd need to search and replace the beginning of each record from the first comma to the tab, since there are commas in the examples.
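
As a rough sketch of that search, assuming the rows are already tab-separated as described above, the following Python removes everything from the first comma up to the first tab of each row. The pattern and names are illustrative, not taken from your data. Rows whose first column has no comma are left alone, and commas in later columns (the synonyms) are not touched.

```python
import re

# Group 1: headword (no comma, tab, or line break), then the comma and
# whatever follows it up to the first tab of the row.
HEAD_EXTRAS = re.compile(r"^([^,\t\r\n]*),[^\t\r\n]*\t", re.MULTILINE)


def strip_head_extras(tsv):
    """Keep only the headword in the first column of each row."""
    return HEAD_EXTRAS.sub(r"\1\t", tsv)


# Hypothetical rows: the first has part of speech and word forms, the second does not.
rows = (
    "abate, verb (abated, abating)\tlessen, subside\tto become less intense\r\n"
    "candid\tfrank, honest\ttruthful\r\n"
)
cleaned = strip_head_extras(rows)
print(cleaned)
```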

Save the file in Notepad++ with the extension .csv, which is a standard extension for "comma separated values", though here we used the Tab character, because you do not seem to use it internally in your file.

Open the file with a spreadsheet, such as Excel from Microsoft Office, or the free LibreOffice Calc if you do not have MS Office. Set Tab as the column separator (CR/LF is always the record separator). The first column should be the base word and the second column the synonyms; the third and fourth columns, definition and sample sentence, can be deleted.
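
If no spreadsheet is at hand, the same column selection can be approximated with Python's standard csv module. This is only a sketch under the assumptions above (tab-separated rows, base word in column 1, synonyms in column 2); the function name and sample text are hypothetical. Quoting is switched off so that quotation marks inside an example sentence are read literally.

```python
import csv
import io


def word_and_synonyms(tsv_text):
    """Return [base word, synonyms] for each non-empty tab-separated row."""
    reader = csv.reader(
        io.StringIO(tsv_text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    return [row[:2] for row in reader if row]


# Hypothetical file contents after the search-and-replace steps.
tsv_text = (
    "abate\tlessen, subside\tto become less intense\tThe storm abated.\r\n"
    "candid\tfrank, honest\ttruthful\tShe was candid.\r\n"
)
pairs = word_and_synonyms(tsv_text)
print(pairs)
```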
