'Don't parse markup languages with Regex' is an annoying trollpost and it should die... right?

ChubakPDP11+TakeWithGrainOfSalt@programming.dev · 2 years ago

'Don't parse markup languages with Regex' is an annoying trollpost and it should die... right?

Static_Rocket@lemmy.world · 2 years ago

I mean, you shouldn’t really parse any language (markup, script or otherwise) with regex. The point is there are other tools for the job that get you closer to what the actual interpreter is expecting to see. It’s really easy to botch a regex and accidentally create new syntax matches.

Regex is fine for noncritical parsing. You’ll often see it used for text editor / IDE language syntax highlighting, but you should also have noticed by now how often that tends to break down with more extreme combinations of syntax.

I do agree that lexers should still be preferred for manual manipulation of existing langues, but do whatever you want. It’s not like any of us can stop you.