Regular expression for git repository

Roughly

^[^@]+@[^:]+:[^/]+/[^.]+\.git$

Git accepts a large range of repository URL expressions:

* ssh://user@host.xz:port/path/to/repo.git/
* ssh://user@host.xz/path/to/repo.git/
* ssh://host.xz:port/path/to/repo.git/
* ssh://host.xz/path/to/repo.git/
* ssh://user@host.xz/~user/path/to/repo.git/
* ssh://host.xz/~user/path/to/repo.git/
* ssh://user@host.xz/~/path/to/repo.git
* ssh://host.xz/~/path/to/repo.git
* user@host.xz:/path/to/repo.git/
* host.xz:/path/to/repo.git/
* user@host.xz:~user/path/to/repo.git/
* host.xz:~user/path/to/repo.git/
* user@host.xz:path/to/repo.git
* host.xz:path/to/repo.git
* rsync://host.xz/path/to/repo.git/
* git://host.xz/path/to/repo.git/
* git://host.xz/~user/path/to/repo.git/
* http://host.xz/path/to/repo.git/
* https://host.xz/path/to/repo.git/
* /path/to/repo.git/
* path/to/repo.git/
* ~/path/to/repo.git
* file:///path/to/repo.git/
* file://~/path/to/repo.git/

For an application that I wrote that requires parsing of these expressions (YonderGit), I came up with the following (Python) regular expressions:

    (1) '(\w+://)(.+@)*([\w\d\.]+)(:[\d]+){0,1}/*(.*)'
    (2) 'file://(.*)'       
    (3) '(.+@)*([\w\d\.]+):(.*)'

For most repository URLs encountered "in the wild", I suspect (1) suffices.
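As a rough illustration of how the three expressions could be combined, here is a minimal sketch. The function name `parse_repo_url`, the returned dictionary layout, and the order in which the patterns are tried are my assumptions, not something taken from YonderGit. Pattern (2) is tried first so that `file://` URLs are not picked up by (1), and (3) is tried last because it would otherwise also match the `scheme:` prefix of a regular URL.

```python
import re

# The three expressions from above, compiled unchanged.
SCHEME_URL = re.compile(r'(\w+://)(.+@)*([\w\d\.]+)(:[\d]+){0,1}/*(.*)')  # (1)
FILE_URL = re.compile(r'file://(.*)')                                      # (2)
SCP_LIKE = re.compile(r'(.+@)*([\w\d\.]+):(.*)')                           # (3)


def parse_repo_url(url):
    """Split a repository URL into parts (hypothetical helper, not YonderGit's API)."""
    m = FILE_URL.match(url)
    if m:
        return {"kind": "file", "user": None, "host": None,
                "port": None, "path": m.group(1)}

    m = SCHEME_URL.match(url)
    if m:
        scheme, user, host, port, path = m.groups()
        return {"kind": scheme[:-3],                    # drop the trailing "://"
                "user": user[:-1] if user else None,    # drop the trailing "@"
                "host": host,
                "port": int(port[1:]) if port else None,  # drop the leading ":"
                "path": path}

    m = SCP_LIKE.match(url)
    if m:
        user, host, path = m.groups()
        return {"kind": "scp", "user": user[:-1] if user else None,
                "host": host, "port": None, "path": path}

    # Nothing matched: assume a plain local path.
    return {"kind": "local", "user": None, "host": None,
            "port": None, "path": url}


print(parse_repo_url("ssh://user@host.xz:22/path/to/repo.git/"))
print(parse_repo_url("user@host.xz:~user/path/to/repo.git/"))
```

Two things to keep in mind with this sketch: expression (1) swallows the slash between host and path, so the path comes back without its leading `/`, and the host class `[\w\d\.]+` has no hyphen, so a URL such as `git@my-host.com:foo.git` falls through to the local-path branch.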


I'm using the following regular expression for online remote repositories:

((git|ssh|http(s)?)|(git@[\w\.]+))(:(//)?)([\w\.@\:/\-~]+)(\.git)(/)?
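A quick way to try this expression from Python is sketched below. The helper name `is_remote_repo` is made up for the example, and using `fullmatch` (so the whole string has to be a repository URL) is my assumption; the expression itself carries no anchors.

```python
import re

# The expression from above, unchanged.
REMOTE_REPO = re.compile(
    r'((git|ssh|http(s)?)|(git@[\w\.]+))(:(//)?)([\w\.@\:/\-~]+)(\.git)(/)?'
)


def is_remote_repo(url):
    """Return True if the whole string matches (hypothetical helper)."""
    return REMOTE_REPO.fullmatch(url) is not None


for candidate in (
    "git@github.com:user/project.git",
    "https://github.com/user/project.git",
    "ssh://user@host.xz:22/path/to/repo.git/",
    "https://github.com/user/project",   # no ".git" suffix
    "/path/to/repo.git/",                # local path
):
    print(candidate, is_remote_repo(candidate))
```

Note that the `.git` suffix is mandatory in this expression, so a clone URL written without it is rejected, as are local paths and `rsync://` URLs.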



FYI, I made a regex to get the owner and repo from a GitHub or Bitbucket URL:

(?P<host>(git@|https://)([\w\.@]+)(/|:))(?P<owner>[\w\-]+)/(?P<repo>[\w\-]+)(\.git)?(/)?
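The named groups make it easy to pull the two parts out in Python. The following is only a sketch: the helper name `split_owner_repo` is hypothetical, and the choice of `fullmatch` over `match` is my assumption. The pattern in the sketch writes the suffix as `(\.git)?` and the name classes as `[\w\-]+`.

```python
import re

OWNER_REPO = re.compile(
    r'(?P<host>(git@|https://)([\w\.@]+)(/|:))'
    r'(?P<owner>[\w\-]+)/(?P<repo>[\w\-]+)(\.git)?(/)?'
)


def split_owner_repo(url):
    """Return (owner, repo), or None if the URL does not fit (hypothetical helper)."""
    m = OWNER_REPO.fullmatch(url)
    if m is None:
        return None
    return m.group("owner"), m.group("repo")


print(split_owner_repo("git@github.com:owner/repo.git"))
print(split_owner_repo("https://bitbucket.org/owner/my-repo"))
```

Because the name classes contain no dot, a repository such as `repo.name.git` is rejected here rather than silently truncated; with `match` instead of `fullmatch` it would come back as `repo`.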

