Copy only specific text from one file to another

I assume every line of the file follows the same pattern. If that is the case, you can use a command like the one below.

grep -o ' path=.*$' file.txt | cut -c8- |rev | cut -c 4- | rev

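Here grep -o reads the file directly (no cat is needed) and prints only the part of each line from ' path=' to the end. The first cut drops the leading ' path="' (7 characters). The rev | cut -c 4- | rev sequence then strips the trailing '"/>' by reversing each line, cutting off the first three characters, and reversing it back.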
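To make the stages easier to follow, here is a small sketch that runs the same pipeline one step at a time on a single line. The variable names (line, stage1, stage2, stage3) are only for illustration, and it assumes rev and a grep that supports -o are available, as the command above already does.

```shell
# One sample line in the same shape as the lines in file.txt.
line='<classpathentry kind="con" path="WOFramework/ERJars"/>'

# Stage 1: keep everything from ' path=' to the end of the line.
stage1=$(printf '%s\n' "$line" | grep -o ' path=.*$')

# Stage 2: drop the first 7 characters (the leading space plus path=").
stage2=$(printf '%s\n' "$stage1" | cut -c8-)

# Stage 3: reverse, drop the first 3 characters (the reversed "/>), reverse back.
stage3=$(printf '%s\n' "$stage2" | rev | cut -c 4- | rev)

# Print the three intermediate results, one per line:
#    path="WOFramework/ERJars"/>
#   WOFramework/ERJars"/>
#   WOFramework/ERJars
printf '%s\n' "$stage1" "$stage2" "$stage3"
```

Note that this only works while every line ends in exactly "/> with nothing after it, since the last stage removes a fixed number of characters.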

Another approach, using awk:

awk -F'path="' '{print $2}' file.txt |rev | cut -c 4- | rev

Here path=" is the field separator, so $2 is everything that follows it on the line. The rev | cut | rev sequence then strips the trailing "/> exactly as above.
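A possible variation on this, which is my own sketch rather than part of the command above: awk treats a field separator longer than one character as a regular expression, so the trailing "/> can be given as a second alternative and the rev/cut cleanup is no longer needed. This assumes each line has a single path= attribute and ends in "/>. The temporary file stands in for file.txt.

```shell
# Hypothetical sample file with the same layout as file.txt.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
EOF

# Split on either path=" or "/> so that $2 is the bare path.
paths=$(awk -F'path="|"/>' '{print $2}' "$tmp")

# Prints:
#   Sources
#   WOFramework/ERJars
printf '%s\n' "$paths"

rm -f "$tmp"
```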

Testing

cat file.txt
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERExtensions"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
<classpathentry kind="con" path="WOFramework/ERPrototypes"/>
<classpathentry kind="con" path="WOFramework/JavaEOAccess"/>
<classpathentry kind="con" path="WOFramework/JavaEOControl"/>
<classpathentry kind="con" path="WOFramework/JavaFoundation"/>
<classpathentry kind="con" path="WOFramework/JavaJDBCAdaptor"/>

After running either command, the output is:

Sources
WOFramework/ERExtensions
WOFramework/ERJars
WOFramework/ERPrototypes
WOFramework/JavaEOAccess
WOFramework/JavaEOControl
WOFramework/JavaFoundation
WOFramework/JavaJDBCAdaptor

A better approach, as suggested by Stephane in the comments, is to split each line on the double quote and take the fourth field:

cut -d '"' -f4 file.txt

A simple approach with awk:

awk -F\" '/WOF/ {print $4}' abc.txt > outfile
  • -F\" changes the field separator from the default (runs of whitespace) to a double quote (escaped with \ to protect it from the shell)
  • /WOF/ restricts processing to the records (lines of the file) that match the pattern WOF
  • $4 is the fourth field of each matching record: the path.
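One thing to be aware of: /WOF/ matches anywhere on the line, so a src entry whose path happened to contain WOF would be picked up too. A slightly stricter sketch, assuming kind is always the first attribute (so its value is field 2), tests that field instead. The directory and file contents here are made up for the example.

```shell
# Hypothetical working directory and input file for the demonstration.
workdir=$(mktemp -d)
cat > "$workdir/abc.txt" <<'EOF'
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERExtensions"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
EOF

# With " as the separator, $2 is the kind value and $4 is the path value.
awk -F\" '$2 == "con" {print $4}' "$workdir/abc.txt" > "$workdir/outfile"

# Prints:
#   WOFramework/ERExtensions
#   WOFramework/ERJars
cat "$workdir/outfile"
```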

Another approach with grep and cut:

grep "kind=\"con\"" sample.txt | cut -d \" -f 4 > sample_edited.txt

This selects all lines containing kind="con" and prints the path, which is the fourth field once cut's delimiter is set to ".
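The same idea can be written with less escaping. In this sketch the pattern is in single quotes, so the double quotes need no backslashes, and -F tells grep to treat it as a fixed string rather than a regular expression. The directory and sample contents are placeholders for the example.

```shell
# Hypothetical working directory and input file for the demonstration.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/JavaEOAccess"/>
<classpathentry kind="con" path="WOFramework/JavaEOControl"/>
EOF

# Keep only the kind="con" lines, then take the 4th "-delimited field.
grep -F 'kind="con"' "$workdir/sample.txt" | cut -d '"' -f 4 > "$workdir/sample_edited.txt"

# Prints:
#   WOFramework/JavaEOAccess
#   WOFramework/JavaEOControl
cat "$workdir/sample_edited.txt"
```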