Are SObject Ids sequential, and should they be used for defining batches in a large data retrieval?
(converted from comments)
Thoughts on this strategy?
You don't need to worry about such stuff.
queryMore() will take care of bringing you different records each time, keeping the "ORDER BY" you've selected. You issue one query (it can be without "ORDER BY", depending on what your business logic demands) and keep calling queryMore until the result set is exhausted. What you're planning to do sounds like a homemade pagination attempt that will mean more code to achieve similar ends compared to queryMore.
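The loop described above can be sketched as follows. This is a minimal illustration, not the official client: it assumes a client object exposing `query` and `query_more` methods that return dictionaries with `records`, `done` and `nextRecordsUrl` keys (the shape used by the `simple_salesforce` Python library's REST wrapper). `FakeClient` and `iter_records` are hypothetical names, and `FakeClient` is only an in-memory stand-in so the sketch runs without an org.

```python
from typing import Any, Dict, Iterator, List


class FakeClient:
    """Hypothetical in-memory stand-in for an API client (not a real library class)."""

    def __init__(self, rows: List[Dict[str, Any]], batch_size: int = 2000) -> None:
        self._rows = rows
        self._batch_size = batch_size

    def _page(self, start: int) -> Dict[str, Any]:
        end = start + self._batch_size
        done = end >= len(self._rows)
        page: Dict[str, Any] = {"records": self._rows[start:end], "done": done}
        if not done:
            # A real server returns an opaque cursor locator; here it is just an offset.
            page["nextRecordsUrl"] = str(end)
        return page

    def query(self, soql: str) -> Dict[str, Any]:
        # The SOQL text is ignored by this stand-in.
        return self._page(0)

    def query_more(self, next_records_identifier: str,
                   identifier_is_url: bool = False) -> Dict[str, Any]:
        return self._page(int(next_records_identifier))


def iter_records(client: Any, soql: str) -> Iterator[Dict[str, Any]]:
    """Issue one query, then follow the cursor until the result set is exhausted."""
    result = client.query(soql)
    while True:
        yield from result["records"]
        if result["done"]:
            break
        result = client.query_more(result["nextRecordsUrl"], identifier_is_url=True)


if __name__ == "__main__":
    rows = [{"Id": i} for i in range(5001)]
    client = FakeClient(rows, batch_size=2000)
    total = sum(1 for _ in iter_records(client, "SELECT Id FROM Account"))
    print(total)
```

The caller never computes offsets or Id ranges; the server-side cursor decides where the next chunk starts.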
queryMore is like a cursor, if you know that concept from Oracle or MSSQL for example (except that not all cursors let you fetch data in blocks, and here you can get up to 2K records in one chunk). Tools like Data Loader, the (outdated) Force Explorer, Real Force Explorer, etc. all use the same mechanism, and they can safely fetch more than 50K rows.
A SOAP API fetch can easily return more than 50K rows that way, and the REST API has a similar mechanism. I'm not sure what the upper bound is; I suspect it's around 50 million records, at which point you'll need to have a look at the Bulk API. The 50K limit is for rows retrieved in one Apex transaction (but if it's a raw read, it won't fire any triggers or invoke any web services with complex logic...).
Would the following queries return the same sequence of items?
No / not always.
- Tim's answer.
Records inserted in one batch might end up with the same timestamp, so ordering by creation date could theoretically sort them randomly within the same group:
SELECT Id, Name, CreatedDate FROM Account ORDER BY CreatedDate ASC
(This might not be completely true; maybe SF stores the values with milliseconds in the underlying Oracle DB and they're just always returned to us with "000"... but then counting on such behaviour could backfire any time they decide to change something.)
CreatedDate and other audit fields can be made writable (only during insert) if you ask SF support nicely. A common use case is migrating data to SF from some old system while accurately indicating that an Account or Opportunity was created X years ago and thus shouldn't impact this quarter's forecasts, for example.
- I'd swear that we couldn't sort by ID but can't find proof in http://www.salesforce.com/us/developer/docs/soql_sosl/Content/sforce_api_calls_soql_select_orderby.htm. Maybe something has changed recently :) For example https://stackoverflow.com/questions/3112870/soql-query-to-fetch-more-than-2000-records indicates that at least in 2010 you couldn't use <, > in Id comparison.
- The more you dig, the funnier it looks. Read the comments in "How to query records by insertion order in SOQL?".
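To make the CreatedDate tie problem from the list above concrete, here is a small self-contained sketch. The record Ids and timestamps are made up for illustration, and the sort runs in Python rather than in SOQL. It shows that a sort on CreatedDate alone leaves rows with identical timestamps in whatever order they arrived, while adding a second, unique key makes the order repeatable. Whether Id is accepted as that second key in ORDER BY for your API version is an assumption to verify.

```python
# Hypothetical records: three share one timestamp, as can happen within a single insert batch.
batch = [
    {"Id": "001xx0000000003", "CreatedDate": "2013-05-01T10:00:00.000Z"},
    {"Id": "001xx0000000001", "CreatedDate": "2013-05-01T10:00:00.000Z"},
    {"Id": "001xx0000000002", "CreatedDate": "2013-05-01T10:00:00.000Z"},
    {"Id": "001xx0000000004", "CreatedDate": "2013-05-01T10:00:01.000Z"},
]


def by_created(rows):
    """Sort on the timestamp only; tied rows keep their incoming order."""
    return [r["Id"] for r in sorted(rows, key=lambda r: r["CreatedDate"])]


def by_created_then_id(rows):
    """Sort on the timestamp, then on a unique key to break ties."""
    return [r["Id"] for r in sorted(rows, key=lambda r: (r["CreatedDate"], r["Id"]))]


# Simulate the same rows arriving in a different physical order.
shuffled = list(reversed(batch))

if __name__ == "__main__":
    print(by_created(batch) == by_created(shuffled))
    print(by_created_then_id(batch) == by_created_then_id(shuffled))
```

A unique tie-breaker only makes the order deterministic. It does not recover true insertion order, which is the point the linked discussions keep running into.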
Salesforce Ids can be sequential at times but it isn't guaranteed -- so not something you should bank on for sorting. They aren't like, say, MongoDB Ids. Here is a blurb from the Apex Testing Best Practices that makes it official:
Record IDs are not created in ascending order unless you insert multiple records with the same request. For example, if you create an account A, and receive the ID 001D000000IEEmT, then create account B, the ID of account B may or may not be sequentially higher.