You can see it in action on regex101: the regex does indeed match the query string in the maliciouswebsite URL, and it fails to match even a URL that only has a port and no user/password.
It is valid (just weird and not recommended) to give a user:pw combo to a website that doesn’t ask for one in its headers. Browsers stripping it off is a separate matter.
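To illustrate the point about user:pw being part of a valid URL, here is a small sketch using the standard `URL` class. The address and credentials are made up for the example; the only claim is that the parser accepts the userinfo part and exposes it separately from the host.

```javascript
// Made-up URL with userinfo and a non-default port (illustration only).
const parsed = new URL("https://user:pw@example.com:8443/index.html");

// The userinfo part is parsed out on its own...
const username = parsed.username;
const password = parsed.password;

// ...and does not leak into the host-related fields.
const hostname = parsed.hostname; // host without the port
const port = parsed.port;         // port as a string

console.log({ username, password, hostname, port });
```

Whether a client actually sends those credentials to the server is a different question from whether the URL parses.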
The sheer number of things you have to take into account to parse a URL properly should convince you not to use regexes for it.
The fact that it’s less code, more correct, faster, and more readable to use new URL() should also be enough to convince you not to use regexes.
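As a rough sketch of what the `new URL()` approach can look like, assuming the goal is to check a URL against a list of trusted hosts (that goal, the `isTrusted` helper, the `TRUSTED_HOSTS` set and the `.example` domains are all hypothetical, not taken from the code under discussion):

```javascript
// Hypothetical allowlist; substitute your own trusted hosts.
const TRUSTED_HOSTS = new Set(["trusted.example"]);

// Hypothetical helper: true only for https URLs whose parsed hostname
// is on the allowlist. Anything that fails to parse is rejected.
function isTrusted(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not an absolute, parseable URL
  }
  return url.protocol === "https:" && TRUSTED_HOSTS.has(url.hostname);
}

// A trusted URL hidden in the query string does not change the hostname.
console.log(isTrusted("https://user:pw@malicious.example:8080/?next=https://trusted.example/"));
// A trusted-looking name in the userinfo part does not either.
console.log(isTrusted("https://trusted.example@malicious.example/"));
// A port on the trusted host is fine, since hostname excludes the port.
console.log(isTrusted("https://trusted.example:8443/login"));
```

The parser handles userinfo, ports and query strings, so the check itself stays a one-line comparison on `hostname`.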
As far as I know, having seen the original source code hasn’t been an issue in previous cases where the reimplementation was done in another language, with the kinds of changes one would expect when coding something up a second time.