Extract more than nine arguments that occur periodically in a sentence to use in macros in order to typeset

(Updated this answer significantly to allow for multiple (expandable) macros in the argument of \fun.)

Here's a LuaLaTeX-based solution. It can handle multiple, expandable macros in the argument of \fun. The Lua code first splits the (expanded) input string into separate words, keeping a trailing punctuation character, if present, attached to its word. It then prints the words, encasing the 3rd, 8th, 13th, 18th, etc. words in the \form macro. (Mathematically speaking, the selection criterion is that the word's position in the table, modulo 5, equals 3.) Non-ASCII UTF-8-encoded characters are fine, because the unicode.utf8.gmatch function rather than the "basic" string.gmatch function is employed.


% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode} % for 'luacode' environment and '\luastring' macro

%% Lua-side code
\begin{luacode}
function do_fun ( s )
  local words = {}  -- initialize a (local) Lua table
  -- split 's' into constituent words
  for word in unicode.utf8.gmatch ( s , "%w+%p?" ) do 
     table.insert ( words , word ) 
  end
  -- apply "form" macro at 3rd, 8th, 13th, etc words
  for i = 1 , #words do
    if i%5 == 3 then
       tex.sprint ( "\\form{"..words[i].."} " )
    else
       tex.sprint ( words[i].." " )
    end
  end
end
\end{luacode}

%% TeX-side code
\newcommand\fun[1]{\directlua{do_fun(\luastring{#1})}}
\newcommand\form[1]{\emph{#1}}
\newcommand\blurbA{Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Ut purus elit, vestibulum ut, placerat ac, adipiscing vitae, felis. Curabitur dictum gravida mauris.}
\newcommand\blurbB{Nam arcu libero, nonummy eget, consectetuer id, vulputate a, magna. Donec vehicula augue eu neque. Pellentesque habitant.}
\newcommand\blurbC{abc1 EDF1 xyz1 efg jkl abc2 EDF2 xyz2 efg jkl abc3 EDF3 xyz3 efg jkl abc4 EDF4 xyz4 efg jkl abc5 EDF5 xyz5 efg jkl abc6 EDF6 xyz6 efg jkl abc7 EDF7 xyz7 efg jkl abc8 EDF8 xyz8 efg jkl abc9 EDF9 xyz9 efg jkl abc10 EDF10 xyz10 efg jkl abc11 EDF11 xyz11 efg jkl abc12 EDF12 xyz12 efg jkl abc13 EDF13 xyz13}

\begin{document}
\raggedright
\fun{\blurbA\blurbB\blurbC} 
\end{document}
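
The selection rule described above (position modulo 5 equals 3) can be made adjustable. The following is only a sketch under stated assumptions: the names do_fun_every and \funevery are hypothetical and not part of the answer above, and the splitting is done with the plain string.gmatch function on runs of non-whitespace bytes ("%S+"). That is safe for UTF-8 input because no byte of a multi-byte character is ASCII whitespace, but unlike the "%w+%p?" pattern it keeps every non-space character (quotes, ellipses, and so on) attached to its word.

```latex
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode}

\begin{luacode}
-- Hypothetical helper (not from the answer above): wrap every word whose
-- position i satisfies i % period == offset in \form{...}.
-- 'offset' is assumed to lie between 0 and period-1.
function do_fun_every ( s , period , offset )
  local i = 0
  -- "%S+" matches maximal runs of non-whitespace bytes
  for word in string.gmatch ( s , "%S+" ) do
    i = i + 1
    if i % period == offset then
      tex.sprint ( "\\form{" .. word .. "} " )
    else
      tex.sprint ( word .. " " )
    end
  end
end
\end{luacode}

% #1 = period, #2 = offset, #3 = text (expanded by \luastring)
\newcommand\funevery[3]{\directlua{do_fun_every(\luastring{#3},#1,#2)}}
\newcommand\form[1]{\emph{#1}}

\begin{document}
\funevery{5}{3}{one two three four five six seven eight nine ten}
\end{document}
```

With period 5 and offset 3, the words in positions 3 and 8 ("three" and "eight") are passed to \form.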

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
\NewDocumentCommand{\fun}{m}
 {
  % split the input at the spaces
  \seq_set_split:Nnn \l_tmpa_seq { ~ } { #1 }
  % use a counter for knowing where we are
  \int_zero:N \l_tmpa_int
  % map the sequence
  \seq_map_inline:Nn \l_tmpa_seq
   {% one more step
    \int_incr:N \l_tmpa_int
    \int_compare:nTF { \int_mod:nn { \l_tmpa_int - 3 } { 5 } = 0 }
     {% if we're at the 3rd, 8th, 13th, ... item, apply \form
      \form { ##1 }
     }
     {% otherwise just deliver the item
      ##1
     }
    % if not at the last, add a space
    \int_compare:nT { \l_tmpa_int < \seq_count:N \l_tmpa_seq } { ~ }
   }
 }
\ExplSyntaxOff

\NewDocumentCommand{\form}{m}{\emph{#1}}

\begin{document}

\raggedright

\fun{Non eram nescius Brute cum quae summis ingeniis exquisitaque 
doctrina philosophi Graeco sermone tractavissent ea Latinis 
litteris mandaremus fore ut hic noster labor in varias 
reprehensiones incurreret Nam quibusdam et iis quidem non 
admodum indoctis totum hoc displicet philosophari Quidam 
autem non tam id reprehendunt si remissius agatur sed tantum 
studium tamque multam operam ponendam in eo non arbitrantur 
Erunt etiam et ii quidem eruditi Graecis litteris contemnentes 
Latinas qui se dicant in Graecis legendis operam malle consumere 
Postremo aliquos futuros suspicor qui me ad alias litteras 
vocent genus hoc scribendi etsi sit elegans personae tamen 
et dignitatis esse negent}

\end{document}
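
The explicit counter in the code above can be avoided if the sequence mapping itself supplies the position. The following is a sketch, assuming an expl3 release recent enough to provide \seq_map_indexed_inline:Nn; the names \funindexed and \l_funidx_words_seq are hypothetical and chosen only for this illustration.

```latex
\documentclass{article}
\usepackage{xparse} % not needed with a LaTeX kernel from 2020-10 or later

\ExplSyntaxOn
\seq_new:N \l_funidx_words_seq % hypothetical variable name
\NewDocumentCommand{\funindexed}{m}
 {
  % split the input at the spaces
  \seq_set_split:Nnn \l_funidx_words_seq { ~ } { #1 }
  % in the mapping, ##1 is the item's position and ##2 the item itself
  \seq_map_indexed_inline:Nn \l_funidx_words_seq
   {
    \int_compare:nNnTF { \int_mod:nn { ##1 } { 5 } } = { 3 }
     { \form { ##2 } }
     { ##2 }
    % add a space unless this is the last item
    \int_compare:nNnF { ##1 } = { \seq_count:N \l_funidx_words_seq } { ~ }
   }
 }
\ExplSyntaxOff

\NewDocumentCommand{\form}{m}{\emph{#1}}

\begin{document}
\funindexed{one two three four five six seven eight nine ten}
\end{document}
```

Here the 3rd and 8th words are handed to \form, following the same modulo-5 rule as the counter-based version.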


\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{listofitems,tabto}
\newcounter{formtrigger}
\newcommand\form[1]{\emph{#1}}
\newcommand\fun[1]{%
  \setsepchar{ }%
  \readlist*\funlist{#1}%
  \setcounter{formtrigger}{2}%
  \foreachitem\x\in\funlist[]{%
    \stepcounter{formtrigger}%
    \ifnum\theformtrigger=5\relax
      \form{\x}\setcounter{formtrigger}{0}%
    \else%
      \x%
    \fi%
    \ %
  }% avoid a spurious space token at the end of the macro body
}
\begin{document}
\fun{abc1 EDF1 xyz1 efg jkl abc2 EDF2 xyz2 efg jkl abc3 EDF3 xyz3 efg jkl abc4 EDF2 xyz4 ...}
\end{document}

ORIGINAL ANSWER

The listofitems package can grab these inputs very easily, preserving the original tokens without expansion.

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{listofitems,tabto}
\newcommand\form[1]{\emph{#1}}
\newcommand\fun[1]{%
  \setsepchar{ }%
  \readlist\funlist{#1}%
  \foreachitem\x\in\funlist[]{%
    Argument \xcnt{} is\tabto{1.3in}``\detokenize\expandafter{\x}'': 
    \tabto{2.5in}\x\par
  }%
}
\begin{document}
\fun{abc1 EDF1 xyz1 efg jkl abc2 EDF2 xyz2 efg jkl abc3 EDF3 xyz3 
  efg jkl abc4 EDF2 xyz4 ... abc1 EDF1 \form{xyz1} efg jkl abc2 
  EDF2 \form{xyz2} efg jkl abc3 EDF3 \form{xyz3} efg jkl abc4 
  EDF2 \form{xyz4} ...}
\end{document}

If you need multi-layer parsing, say with efg jkl acting as the trigger that separates larger groups of arguments, the following approach works (note: efg jkl is then treated as an argument separator, not as an argument):

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{listofitems,tabto}
\newcommand\form[1]{\emph{#1}}
\newcommand\fun[1]{%
  \setsepchar{efg jkl/ }%
  \readlist*\funlist{#1}%
  \foreachitem\x\in\funlist[]{%
    \foreachitem\y\in\funlist[\xcnt]{%
    Group \xcnt{} sub-argument \ycnt{} is\tabto{2in}``\detokenize
      \expandafter\expandafter\expandafter{\funlist[\xcnt,\ycnt]}'': 
    \tabto{3.2in}\funlist[\xcnt,\ycnt]\par
  }}%
}
\begin{document}
\fun{abc1 EDF1 xyz1 efg jkl abc2 EDF2 xyz2 efg jkl abc3 EDF3 xyz3 
  efg jkl abc4 EDF2 xyz4 ... abc1 EDF1 \form{xyz1} efg jkl abc2 
  EDF2 \form{xyz2} efg jkl abc3 EDF3 \form{xyz3} efg jkl abc4 
  EDF2 \form{xyz4} ...}
\end{document}
