How to search for code in GitHub with GitHub API?

2020: As detailed in Mark Z.'s answer, sending an authentication header ('Authorization': 'Token xxxx') allows for a code search:

GET /search/code

You can use:

  • either a dedicated command-line tool like feinoujc/gh-search-cli

    ghs code --extension js "import _ from 'lodash'"
    
  • or the official GitHub CLI gh (after a gh auth login), as shown in issue 5117:

    gh api --method=GET "search/code?q=filename:test+extension:yaml+org:new-org"
    

    Or even:

    gh api --method=GET search/code -f q='filename:test extension:yaml org:new-org' \
           --jq '.items[] | [.repository.full_name,.path,.sha] | @tsv'
    

    That would get a line-based, tab-separated list of fields in this order: repo name, file path, git sha. (see gh help formatting)
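For readers who would rather post-process the JSON themselves, the jq filter above can be approximated in Python. This is only a sketch: the helper name items_to_tsv and the sample data are hypothetical, and it assumes the response has already been parsed into a dict whose "items" entries carry the repository.full_name, path, and sha fields used by the jq expression.

```python
def items_to_tsv(items):
    """Format code-search result items as tab-separated lines.

    Field order mirrors the jq filter: repo name, file path, git sha.
    """
    lines = []
    for item in items:
        fields = [item["repository"]["full_name"], item["path"], item["sha"]]
        lines.append("\t".join(fields))
    return "\n".join(lines)


# Hypothetical sample shaped like the "items" array of a search/code response.
sample_items = [
    {"repository": {"full_name": "new-org/demo"}, "path": "ci/test.yaml", "sha": "abc123"},
    {"repository": {"full_name": "new-org/tools"}, "path": "test.yaml", "sha": "def456"},
]

print(items_to_tsv(sample_items))
```

In practice the items list would come from the parsed output of the gh api call (without --jq) rather than from hard-coded data.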


2014 (original answer): That seems related to the new restriction "New Validation Rule for Beta Code Search API" (Oct. 2013)

In order to support the expected volume of requests, we’re applying a new validation rule to the Code Search API. Starting today, you will need to scope your code queries to a specific set of users, organizations, or repositories.

So the example in the code search API documentation now reads:

Suppose you want to find the definition of the addClass function inside jQuery. Your query would look something like this:

https://api.github.com/search/code?q=addClass+in:file+language:js+repo:jquery/jquery
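A scoped query like the one above can also be assembled programmatically. The following is a minimal sketch using only the standard library; the function name build_code_search_url is hypothetical, and the qualifiers are simply the ones from the jQuery example.

```python
from urllib.parse import urlencode


def build_code_search_url(term, qualifiers):
    """Build a search/code URL from a search term and qualifier pairs.

    Qualifiers are joined with spaces, which urlencode turns into '+';
    ':' and '/' are kept readable via the safe argument.
    """
    q = " ".join([term] + [f"{name}:{value}" for name, value in qualifiers.items()])
    return "https://api.github.com/search/code?" + urlencode({"q": q}, safe=":/")


url = build_code_search_url(
    "addClass",
    {"in": "file", "language": "js", "repo": "jquery/jquery"},
)
print(url)
```

Swapping the repo qualifier for user or org scopes the search to a user or organization instead.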


You can do a code search without specifying a user/org/repo if you authenticate.

First, generate a personal access token for this purpose from your profile on GitHub's website: Settings -> Developer Settings -> Personal Access Token -> Generate New Token (you can leave all access options unticked, since you're just using it to make web requests).

Now, your original GET request will work and return results, if you append the token to it:

https://api.github.com/search/code?q=addClass&access_token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

UPDATE: OCT 2021 As pointed out by a comment below, passing the token in via a query parameter (like above) is deprecated. You must now add it as an Authorization header.

e.g.

curl --location --request GET 'https://api.github.com/search/code?q=addClass+in:file+language:csharp' \
--header 'Authorization: Token xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

or in Python:

import requests

# Pass the query via params so requests URL-encodes it;
# qualifiers are separated by spaces.
url = "https://api.github.com/search/code"
params = {'q': 'addClass in:file language:csharp'}

headers = {
  'Authorization': 'Token xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
}

response = requests.get(url, params=params, headers=headers)

print(response.text)

While GitHub does not currently support code search without a repo, user, or organization (see VonC's answer), searchcode does index some sources from GitHub and exposes them via its codesearch API, albeit with an API less fully featured than GitHub's.

For example, to search for wget invocations indexed from GitHub, call

curl "https://searchcode.com/api/codesearch_I/?q=wget&src=2"

The optional src parameter picks the code source (e.g., GitHub, Bitbucket) to search; you can find the integer value for a source by altering the faceted-search parameters in the searchcode UI. The current value of src for GitHub is 2.

You can verify that the results returned by the above example come from github.com by viewing the repo property of the result items.
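That check can be automated once the response has been parsed. The sketch below makes two assumptions: that each result item's repo property holds a repository URL (for example https://github.com/user/project.git), and that the items sit under a top-level "results" key. The helper name and sample data are hypothetical.

```python
from urllib.parse import urlparse


def is_from_github(result):
    """Return True if the result's repo URL points at github.com."""
    host = urlparse(result.get("repo", "")).hostname or ""
    return host == "github.com" or host.endswith(".github.com")


# Hypothetical parsed response; the "results" key and the URL-shaped
# "repo" values are assumptions, not taken from the API documentation.
sample_response = {
    "results": [
        {"repo": "https://github.com/example/project.git"},
        {"repo": "https://bitbucket.org/example/other.git"},
    ]
}

github_only = [r for r in sample_response["results"] if is_from_github(r)]
print(len(github_only), "of", len(sample_response["results"]), "results come from github.com")
```

Comparing the parsed hostname, rather than searching for the substring "github.com", avoids matching look-alike hosts.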