Salesforce, Oracle and the Seven Dwarfs

This seems timely as Grumpy just popped out of the developer console at me.

Grumpy Dwarf in Salesforce

As per @sfdcfox's comment and answer to the linked question.

The Seven Dwarfs live in the platform code. You should never see DOPEY, SLEEPY, DOC, GRUMPY, SNEEZY, BASHFUL, or HAPPY, but occasionally they break out and appear to a user. If you spot one of them, you need to contact support so they can be put back in their dwarfy walled garden.

There are references to them back in 2009. See Salesforce, Snow White and the Seven Dwarfs and The Silver Lining - Meaningful Error Messages #94.

I suspect anyone who can tell you exactly where they are in relation to the database has probably signed a non-disclosure agreement. That doesn't mean we can't make some wild conjectures.

The presence of Oracle error codes like ORA-00001 and ORA-20191 plus the java.sql.SQLException would indicate they are some sort of layer immediately above the database running Java.
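If you want to flag these leaks when they show up in a log or an error string, here's a rough sketch. It only checks for the three markers mentioned above: one of the seven dwarf names, something shaped like an Oracle code (`ORA-` plus five digits), or the literal `java.sql.SQLException`. The class name, the method names and the sample message are all made up for illustration; none of this is a Salesforce API, and I'm assuming the dwarf names appear in upper case as whole words.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: spots signs that a platform-level error leaked through.
final class DwarfSpotter {
    // The seven names listed in this answer.
    static final List<String> DWARFS = List.of(
        "DOPEY", "SLEEPY", "DOC", "GRUMPY", "SNEEZY", "BASHFUL", "HAPPY");

    // Assumption: Oracle codes look like ORA-00001, i.e. "ORA-" plus five digits.
    static final Pattern ORA_CODE = Pattern.compile("ORA-\\d{5}");

    // Returns the first dwarf name found as a whole upper-case word, if any.
    static Optional<String> findDwarf(String message) {
        for (String dwarf : DWARFS) {
            // Word boundaries stop DOC from matching inside DOCUMENT.
            Pattern p = Pattern.compile("\\b" + dwarf + "\\b");
            if (p.matcher(message).find()) {
                return Optional.of(dwarf);
            }
        }
        return Optional.empty();
    }

    // Returns the first Oracle-style error code in the message, if any.
    static Optional<String> findOracleCode(String message) {
        Matcher m = ORA_CODE.matcher(message);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // True if the message carries any of the three markers.
    static boolean looksLikePlatformLeak(String message) {
        return findDwarf(message).isPresent()
            || findOracleCode(message).isPresent()
            || message.contains("java.sql.SQLException");
    }
}
```

Calling `DwarfSpotter.looksLikePlatformLeak(...)` on a made-up message such as `"GRUMPY java.sql.SQLException: ORA-00001"` would return true, which is your cue to raise a support case rather than go hunting in your own code.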

It's also interesting that they tend to bypass the GACK. Either they aren't being caught by the GACK mechanism or they are being rethrown. The latter seems unlikely: why surface these errors rather than the generic GACK message? So the dwarfs probably live outside the standard application GACK error handling.

Why are there only 7? I'd assume it has something to do with the way Salesforce scales. Each instance/pod, e.g. na7, may have exactly 7 dwarfs. There are never any more or fewer, and they would stand up a new pod rather than scale out the dwarfs. If you've got seven fixed servers, I guess you are naming them after either dwarfs or deadly sins. Getting errors from Happy is probably better than getting errors from Lust or Gluttony.


Here's slide 8 from a Salesforce.com architecture talk that I've defaced to show where I think the dwarfs roughly reside in relation to the servers that make up a pod and its database cluster.


Here's another diagram from The Force.com Multitenant Architecture PDF.

The Query Servers might be in about the right place and performing the expected dwarf functions.



In Performance Monitoring and Testing in the Salesforce Cloud, the pod is shown as "30 plus servers" being used for Application, API, and Search servers over an "8 Node RAC cluster". I've never done infrastructure at that scale, but I suspect there is something working between all those servers and the database cluster.



See also:

* Quora: What does Salesforce's infrastructure look like?


While I cannot provide insight into the naming conventions of the Salesforce devs, these errors seem to imply that the database is timing out while trying to acquire concurrency-control locks on parts of the database that are already locked or being accessed by another transaction. Perhaps you have conflicting transactions in your code.

Tags:

Apex