Concurrency Limits

Concurrency Limits Overview

ICE Mortgage Technology enforces concurrent rate limits: a cap on the number of simultaneous API calls that can be in flight at any given time.

Each lender has a hard stop limit of concurrent API calls that is set for each Encompass environment/instance.  This limit is across all services submitting calls to the given instance, including lender and partner API calls.  The default limit for each lender is set during Encompass instance provisioning.

If the lender's hard stop limit of concurrent API calls is fully utilized, all further calls (both partner and lender initiated) are rejected.

Working with Concurrency Limits

The header of each API response includes two concurrency key/value pairs: X-Concurrency-Limit-Limit and X-Concurrency-Limit-Remaining. These tell the API consumer their allocated concurrency limit and the remaining concurrency at that moment.

For lenders and partners that, by the nature of their service and integration, are expected to make a large volume of calls to a lender's instance (event-based services, downloading and uploading large volumes of attachments, use of threading and parallel processing, etc.), it is important to design the integration so that the API caller can limit and configure the number of concurrent calls made to the lender's environment. Once the lender's hard stop limit of concurrent API calls is fully utilized, no further calls (partner or lender initiated) are accepted, and a 429 - Too Many Requests status code is returned. At this point, the lender and any Integration Applications cease to execute. The 429 - Too Many Requests status code will continue to be returned until the remaining concurrency is greater than 0.

Best practice is to use exponential back-off on concurrency violations (when the concurrency limit has been reached), to throttle proactively (when the available concurrency falls outside an acceptable range), and not to exceed 20 percent of the allowed Concurrency Limit.
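The exponential back-off portion of this guidance can be sketched as follows. This is a minimal, client-agnostic illustration, not an official SDK pattern: make_request is a hypothetical callable standing in for whatever HTTP client your integration uses, and the retry count and base delay are assumptions you should tune for your environment.

```python
import random
import time

def call_with_backoff(make_request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call make_request(), retrying on HTTP 429 with exponential back-off.

    make_request is a hypothetical zero-argument callable returning
    (status_code, body). Delays grow as 1s, 2s, 4s, ... plus random
    jitter so that many blocked callers do not retry in lockstep.
    """
    for attempt in range(max_retries):
        status, body = make_request()
        if status != 429:
            return status, body
        # Concurrency violation: back off before trying again.
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("concurrency limit still exceeded after retries")
```

Injecting the sleep function keeps the back-off logic testable and lets a scheduler substitute its own delay mechanism.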

The Integration Application can use the X-Concurrency-Limit-Remaining and X-Concurrency-Limit-Limit response headers to calculate the ratio X-Concurrency-Limit-Remaining / X-Concurrency-Limit-Limit. If this ratio falls below 20 percent, the Integration Application will need to incorporate a proactive throttling process until X-Concurrency-Limit-Remaining recovers. The threshold ratio can also be a configuration setting in your Integration Application.
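The ratio check above might look like this in practice. This is a sketch under the assumption that response headers are available as a string-keyed mapping; the configurable min_ratio parameter mirrors the 20 percent guidance.

```python
def should_throttle(headers, min_ratio=0.20):
    """Return True when available concurrency is below the acceptable range.

    headers is assumed to be a dict-like mapping of response headers.
    min_ratio mirrors the 20 percent threshold and can be made a
    configuration setting in the Integration Application.
    """
    limit = int(headers["X-Concurrency-Limit-Limit"])
    remaining = int(headers["X-Concurrency-Limit-Remaining"])
    return (remaining / limit) < min_ratio
```

A caller would inspect each response with this check and pause or slow its request loop whenever it returns True, rather than waiting for a 429.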

Best Practices for Efficient Use of API Concurrency

The best way to avoid being throttled is to reduce the number of simultaneous requests, which can normally be achieved through code optimization.

Consider the following practices when designing your app/integration:

  • Cache frequently used data rather than fetching it repeatedly
  • Batch requests when processing webhook notifications
  • Eliminate unnecessary and redundant API requests
  • Use a message queue to regulate call frequency
  • Implement an exponential back-off to gracefully recover from throttle errors
  • Be cautious with timeouts – concurrency is a factor of how long requests take to execute, not how long your app waits for responses
  • Use metadata, included with API responses, to meter and prioritize requests
  • Reduce latency by initiating requests as a super administrator – fewer business rules are applied with this type of account

Are you using the right automation for your use case?

  • Webhooks, List APIs, Batch Calls - choose the right tool for your use case
  • Use webhook events to listen for changes instead of polling via GET calls
  • Get multiple attachments in a single call
  • Leverage the Pipeline API to avoid multiple requests to retrieve loans
  • Take advantage of Enhanced Field Change Events (GA in 24.2), which already include the changed data