Extract repository name from GitHub URL in bash

Solution 1:

$ url=git://github.com/some-user/my-repo.git
$ basename=$(basename "$url")
$ echo "$basename"
my-repo.git
$ filename=${basename%.*}
$ echo "$filename"
my-repo
$ extension=${basename##*.}
$ echo "$extension"
git
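
One caveat with ${basename%.*}: it removes whatever follows the last dot, not specifically .git. If the URL is given without the .git suffix and the repository name contains a dot, part of the name is lost. A minimal sketch of the difference, where my.repo is a made-up repository name used only for illustration:

```shell
# Hypothetical URL without the .git suffix; "my.repo" is an invented name.
url=https://github.com/some-user/my.repo
name=$(basename "$url")

generic=${name%.*}      # strips from the last dot onward
specific=${name%.git}   # strips only a trailing ".git", if present

echo "$generic"    # my
echo "$specific"   # my.repo
```

So ${name%.git} is probably the safer choice when the suffix may be missing.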

Solution 2:

I'd go with basename "$URL" .git, which strips both the leading path and the .git suffix in a single call.
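
As a rough sketch, this can be wrapped in a small function. The name repo_name is my own invention, not part of any tool:

```shell
# repo_name is a hypothetical helper name chosen for this example.
repo_name() {
    # basename drops everything up to the last "/"; given a second
    # argument, it also removes that suffix from the result.
    basename "$1" .git
}

repo_name git://github.com/some-user/my-repo.git      # my-repo
repo_name https://github.com/some-user/my-repo.git    # my-repo
repo_name git@github.com:some-user/my-repo.git        # my-repo
```

It works for the SSH form too, because the repository name is still the part after the last slash.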


Solution 3:

Old post, but I faced the same problem recently.

The regex ^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+).git$ works for the three types of URL (git://, https://, and the SSH form git@host:user/repo.git).

#!/bin/bash

# url="git://github.com/some-user/my-repo.git"
# url="https://github.com/some-user/my-repo.git"
url="git@github.com:some-user/my-repo.git"

re="^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+).git$"

if [[ $url =~ $re ]]; then
    protocol=${BASH_REMATCH[1]}
    separator=${BASH_REMATCH[2]}
    hostname=${BASH_REMATCH[3]}
    user=${BASH_REMATCH[4]}
    repo=${BASH_REMATCH[5]}
fi

Explanation (see it in action on regex101):

  • ^ matches the start of a string
  • (https|git) matches and captures the characters https or git
  • (:\/\/|@) matches and captures the characters :// or @
  • ([^\/:]+) matches and captures one character or more that is not / nor :
  • [\/:] matches one character that is / or :
  • ([^\/:]+) matches and captures one character or more that is not / nor :, yet again
  • \/ matches the character /
  • (.+) matches and captures one character or more
  • .git matches the suffix .git (strictly speaking, the unescaped . matches any character; \.git would match a literal dot only)
  • $ matches the end of a string

This is far from perfect, as something like https@github.com:some-user/my-repo.git would match, but I think it's fine enough for extraction.
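
If the loose matching is a concern, one possible tightening is to use a separate pattern per URL form, so that https can no longer be combined with @. The sketch below is a variant of the idea above, not the original regex. The function name parse_github_url and its "host user repo" output format are assumptions made for the example:

```shell
#!/usr/bin/env bash

# parse_github_url is a hypothetical helper; it prints "host user repo".
parse_github_url() {
    local url=$1
    # Scheme form: (https|git)://host/user/repo[.git]
    local re_scheme='^(https|git)://([^/:]+)/([^/:]+)/(.+)$'
    # scp-like SSH form: git@host:user/repo[.git]
    local re_ssh='^git@([^/:]+):([^/:]+)/(.+)$'
    local host user repo

    if [[ $url =~ $re_scheme ]]; then
        host=${BASH_REMATCH[2]} user=${BASH_REMATCH[3]} repo=${BASH_REMATCH[4]}
    elif [[ $url =~ $re_ssh ]]; then
        host=${BASH_REMATCH[1]} user=${BASH_REMATCH[2]} repo=${BASH_REMATCH[3]}
    else
        return 1    # not one of the recognized forms
    fi

    repo=${repo%.git}    # drop a literal trailing ".git", if present
    printf '%s %s %s\n' "$host" "$user" "$repo"
}

parse_github_url "git@github.com:some-user/my-repo.git"
# github.com some-user my-repo
```

Stripping the suffix with ${repo%.git} instead of matching it in the regex also makes the .git part optional.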


Solution 4:

Summing up:

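  • Get URL without the (optional) .git suffix. Note that ${url%.*} would cut at the last dot, which is wrong when the suffix is missing:

    url_without_suffix="${url%.git}"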
    
  • Get repository name:

    reponame="$(basename "${url_without_suffix}")"
    
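  • Get the user (owner) name afterwards. This works for the git:// and https:// forms; for git@host:user/repo the result still includes the host part:

    username="$(basename "${url_without_suffix%/${reponame}}")"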
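
For completeness, here is a sketch of the same steps using only parameter expansion, which also copes with the SSH form. The variable names path and owner are my own choices for the example:

```shell
url=git@github.com:some-user/my-repo.git

path=${url%.git}         # drop a trailing ".git", if any
reponame=${path##*/}     # text after the last "/"
owner=${path%/*}         # everything before the last "/"
owner=${owner##*[:/]}    # text after the last ":" or "/"

echo "$owner"       # some-user
echo "$reponame"    # my-repo
```

No external commands are spawned here, which may matter if this runs in a loop over many URLs.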