API Rate Limit Exceeding #258

@sbhaktha

Hi @kohsuke,

I would appreciate your help on this.

I am using the OkHttpConnector with a cache directory on my server. I authenticate with OAuth, and my quota appears to be 5000 requests, based on this sample cache file:

https://api.github.com/repos/allenai/aristo-tables/contents/tables/weather_terms?ref=master
GET
2
Authorization: token e381d0427927aef5e2858ac06b6cb01a34b0a603
Accept-Encoding: gzip
HTTP/1.1 200 OK
30
Server: GitHub.com
Date: Thu, 10 Mar 2016 20:57:33 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Status: 200 OK
**X-RateLimit-Limit: 5000**
X-RateLimit-Remaining: 4689
X-RateLimit-Reset: 1457646852
Cache-Control: private, max-age=60, s-maxage=60
Vary: Accept, Authorization, Cookie, X-GitHub-OTP
ETag: W/"c2dc693298f7806038e984d1ac857ffb"
Last-Modified: Tue, 08 Mar 2016 23:54:00 GMT
X-OAuth-Scopes: read:repo_hook, repo
X-Accepted-OAuth-Scopes:
X-OAuth-Client-Id: 47355241bdf02ac9122d
X-GitHub-Media-Type: github.v3; format=json
Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
Access-Control-Allow-Origin: *
Content-Security-Policy: default-src 'none'
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
Vary: Accept-Encoding
X-Served-By: 01d096e6cfe28f8aea352e988c332cd3
Content-Encoding: gzip
X-GitHub-Request-Id: 36D5C9C0:101B5:A516A26:56E1DFBD
OkHttp-Selected-Protocol: http/1.1
OkHttp-Sent-Millis: 1457643453839
OkHttp-Received-Millis: 1457643453959

My client refreshes periodically to stay in sync with the repo. However, even though nothing in the repo has changed, I run out of API rate limit every now and then. I expected these requests to be served from the cache.
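
To make the rate-limit headers in the sample cache file concrete, here is a minimal sketch in plain Scala that reads them and works out how much quota has been used and how long remains until the reset. `RateLimitSketch`, `RateLimit`, and `parse` are names made up for illustration; they are not part of the github-api library, and this is not how the library tracks the limit internally.

```scala
import scala.util.Try

// Illustrative only: these names are hypothetical, not github-api classes.
object RateLimitSketch {
  final case class RateLimit(limit: Int, remaining: Int, resetEpochSeconds: Long) {
    // Requests already consumed in the current window.
    def used: Int = limit - remaining

    // Seconds from `nowEpochSeconds` until the quota resets (never negative).
    def secondsUntilReset(nowEpochSeconds: Long): Long =
      math.max(0L, resetEpochSeconds - nowEpochSeconds)
  }

  // Parse "Name: value" header lines; lines without a colon (such as the
  // status line) are skipped. Returns None if any of the three rate-limit
  // headers is missing or not numeric.
  def parse(headerLines: Seq[String]): Option[RateLimit] = {
    val headers = headerLines.flatMap { line =>
      line.split(":", 2) match {
        case Array(name, value) => Some(name.trim.toLowerCase -> value.trim)
        case _                  => None
      }
    }.toMap

    for {
      limit     <- headers.get("x-ratelimit-limit").flatMap(v => Try(v.toInt).toOption)
      remaining <- headers.get("x-ratelimit-remaining").flatMap(v => Try(v.toInt).toOption)
      reset     <- headers.get("x-ratelimit-reset").flatMap(v => Try(v.toLong).toOption)
    } yield RateLimit(limit, remaining, reset)
  }
}
```

With the values from the sample (limit 5000, remaining 4689, reset 1457646852), `used` is 311. Taking the `OkHttp-Sent-Millis` timestamp (1457643453 in seconds) as "now", `secondsUntilReset` is 3399, a little under an hour.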

The following call gets executed on every refresh:

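  // Imports this snippet relies on (github-api with its OkHttp 2.x connector).
  // GitRepoInfo and cache are defined elsewhere in my code.
  import com.squareup.okhttp.{OkHttpClient, OkUrlFactory}
  import org.kohsuke.github.{GHContent, GitHubBuilder, RateLimitHandler}
  import org.kohsuke.github.extras.OkHttpConnector
  import scala.collection.JavaConverters._
  import scala.concurrent.blocking

  private def getTableDirs(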
    oauthAccessToken: String,
    repo: GitRepoInfo,
    tableNamesFilter: Option[Seq[String]]
  ): Seq[GHContent] = {
    blocking {
      // Create a GitHubBuilder to be able to build a GitHub object with required
      // RateLimitHandler strategy and OAuth parameters. Instead of waiting, this will
      // throw an exception immediately if the request limit is exceeded.
      val gitHubBuilder =
        new GitHubBuilder()
          .withRateLimitHandler(RateLimitHandler.FAIL)
          .withOAuthToken(oauthAccessToken)
          .withConnector(
            new OkHttpConnector(
              new OkUrlFactory(
                new OkHttpClient().setCache(cache))))
      val github = gitHubBuilder.build()

      // Get the requested repo.
      val repoName = repo.fork + "/" + repo.repo
      val repository = github.getRepository(repoName)
      // Get all directories (expected to be Table directories) from the top level of the repo.
      val allTableDirs =
        repository.getDirectoryContent("tables", repo.branch).asScala.filter(_.isDirectory)
      // If there is a filter, restrict returned table directories to that set, if not return all.
      tableNamesFilter match {
        case Some(filter) =>
          val tableSet = filter.map(_.toLowerCase).toSet
          allTableDirs.filter(d => tableSet.contains(d.getName.toLowerCase))
        case None =>
          allTableDirs
      }
    }
  }

Further, there are other calls like ghContent.read. Is each of these a separate request to GitHub? Even so, I would not expect them to be sent every time, but to be served from the cache.
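
One detail in the sample cache file that may matter here is `Cache-Control: private, max-age=60, s-maxage=60`. Under standard HTTP caching rules, such a response can be reused without contacting the server for roughly 60 seconds; after that, a cache would normally have to revalidate or refetch it. The sketch below is a simplified freshness check in plain Scala. `FreshnessSketch` and its methods are hypothetical names, the check ignores the `Age` header and clock skew, and I have not verified that this is what accounts for the quota usage.

```scala
import scala.util.Try

// Illustrative only: a simplified freshness check, not OkHttp's actual logic.
object FreshnessSketch {
  // Extract max-age (in seconds) from a Cache-Control header value, if present.
  // "s-maxage" is deliberately not matched; it applies to shared caches.
  def maxAgeSeconds(cacheControl: String): Option[Long] =
    cacheControl
      .split(",")
      .map(_.trim.toLowerCase)
      .collectFirst { case d if d.startsWith("max-age=") => d.stripPrefix("max-age=") }
      .flatMap(v => Try(v.toLong).toOption)

  // True while the entry is younger than max-age; false if there is no max-age.
  def isFresh(cacheControl: String, receivedMillis: Long, nowMillis: Long): Boolean =
    maxAgeSeconds(cacheControl).exists(maxAge => nowMillis - receivedMillis < maxAge * 1000L)
}
```

For the sample entry (`OkHttp-Received-Millis: 1457643453959`), this check treats the response as fresh 30 seconds after it was received and as stale 61 seconds after.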

Any ideas?
