Need explanation on this regex -



I have a regex that is used to split a string:

,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)

Example string:

"field1","field2","item1,item2,item3","hello,""john"""

The one thing I understand is that it splits the string on a comma; beyond that I am not sure.

Could someone explain the regex, please?

If you can dissect it at the simplest possible level, I would appreciate it.

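This regex matches a comma only if it is outside double quotes. It decides this by counting the quotes that come after the comma: the lookahead succeeds only when an even number of quotes follows, up to the end of the string.

Explanation:

, -> match a literal comma
(?=...) -> positive lookahead: what follows must match, but is not consumed
[^"]*" -> match zero or more non-quote characters followed by a literal "
[^"]*"[^"]*" -> match a pair of the above, i.e. two quotes
(?:[^"]*"[^"]*")* -> match zero or more such pairs (0, 2, 4, 6, ... quotes)
[^"]*$ -> followed by only non-quote characters until the end of the string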
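The breakdown above can be written out as an annotated pattern. This is only a sketch, assuming Python's re module with the re.VERBOSE flag (the question does not say which language the regex comes from); the name SPLIT_ON_UNQUOTED_COMMA is made up for illustration. The \" in the question matches a plain ", so it is written as " here.

```python
import re

# Hypothetical name; the pattern is the one from the question,
# spread over several lines with re.VERBOSE so each part can be commented.
SPLIT_ON_UNQUOTED_COMMA = re.compile(r'''
    ,               # a literal comma
    (?=             # positive lookahead: must match here, consumes nothing
        (?:         #   non-capturing group holding one pair of quotes
            [^"]*"  #     non-quote characters, then a quote
            [^"]*"  #     non-quote characters, then a second quote
        )*          #   zero or more pairs, so an even number of quotes
        [^"]*       #   any remaining non-quote characters
        $           #   end of the string
    )
''', re.VERBOSE)

# The example string from the question.
line = '"field1","field2","item1,item2,item3","hello,""john"""'
fields = SPLIT_ON_UNQUOTED_COMMA.split(line)

for field in fields:
    print(field)
```

With the string from the question this should give four fields, because the commas inside "item1,item2,item3" and after hello are each followed by an odd number of quotes.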

Example input:

"field1,field2","field3","item1,item2,item3"

The first match is the comma before "field3", because the lookahead (?=(?:[^"]*"[^"]*")*[^"]*$) finds 4 double quotes after that comma.

The second match is the comma after "field3", because the same lookahead finds 2 double quotes after that comma.

It does not match the comma between field1 and field2, because the number of quotes after it is odd (5), and hence the lookahead (?=(?:[^"]*"[^"]*")*[^"]*$) fails. The commas inside "item1,item2,item3" are rejected for the same reason: only 1 quote follows each of them.
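The quote counting in this example can be checked with a short script. This is a sketch assuming Python's re module; the helper quotes_after_each_comma is a made-up name for illustration and is not part of the regex itself.

```python
import re

pattern = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')
text = '"field1,field2","field3","item1,item2,item3"'


def quotes_after_each_comma(s):
    """Return (index, number of quotes after it) for every comma in s."""
    return [(i, s[i + 1:].count('"')) for i, ch in enumerate(s) if ch == ',']


counts = quotes_after_each_comma(text)
matched = [m.start() for m in pattern.finditer(text)]

# A comma should be matched exactly when an even number of quotes follows it.
for index, n_quotes in counts:
    status = "match" if index in matched else "no match"
    print(f"comma at {index}: {n_quotes} quotes after -> {status}")

parts = pattern.split(text)
print(parts)
```

Only the two commas followed by an even number of quotes are used as split points, so the input is split into three fields and the quoted commas are left alone.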
