How does the Java lock concept work internally?

The other answer describes what the language definition says, but not what "happens internally".

In HotSpot, every Java object has a two-word object header: the mark word and the klass pointer. The first word (the mark word) stores locking information and caches the identity hash code. The second word is a pointer to the klass object, which stores the static information (including method code) for that object's class.

The HotSpot JVM has several locking optimizations, including thin locks and biased locking. In practice this means that if you never lock an object, or if there is never any contention for it, no monitor object (a structure that stores the extra locking information) is ever created for it.

A monitor object has an entry set. When you try to lock an object that is already locked, your thread is added to the entry set. When the owning thread unlocks the object, it wakes up a thread in the entry set.
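The entry set is not directly visible from Java code, but its effect is. The following is a minimal sketch (the class and method names are made up for illustration): while one thread holds the monitor, a second thread that tries to enter it is reported by `Thread.getState()` as `BLOCKED`.

```java
class EntrySetDemo {
    /** Returns the state observed for a thread that is stuck trying to enter a held monitor. */
    static Thread.State contenderState() {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Entered only after the other thread has released the monitor.
            }
        });

        Thread.State observed;
        synchronized (lock) {
            contender.start();
            // Poll until the contender has stopped spinning and is blocked on the monitor.
            do {
                Thread.yield();
                observed = contender.getState();
            } while (observed != Thread.State.BLOCKED);
        } // Leaving the block releases the monitor, so the contender can now get in.

        try {
            contender.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(contenderState());
    }
}
```

The polling loop is only there to make the demo deterministic; real code should never busy-wait on another thread's state.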

Concurrency is a very complicated field, and there are obviously many more details.

UPDATE

The object header is explained here, and the details of what happens with an object monitor (including wait sets) can be found in this OpenJDK code.
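As a rough illustration of the wait set from the Java side, here is a small sketch (hypothetical class and method names). A thread that calls `wait()` is reported as `WAITING`. After `notify()` it still cannot return from `wait()` until the notifier releases the monitor, because it has to reacquire it first. On HotSpot the notified thread is typically reported as `BLOCKED` at that point, but that is an implementation detail, so the sketch only records the state without relying on it.

```java
class WaitSetDemo {
    /** Returns { state while in the wait set, state right after notify() }. */
    static Thread.State[] observe() {
        final Object lock = new Object();
        final boolean[] ready = {false}; // guarded by lock

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!ready[0]) {           // loop guards against spurious wakeups
                    try {
                        lock.wait();          // releases the monitor and joins the wait set
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        waiter.start();

        // Poll until the waiter is inside wait().
        Thread.State beforeNotify;
        do {
            Thread.yield();
            beforeNotify = waiter.getState();
        } while (beforeNotify != Thread.State.WAITING);

        Thread.State afterNotify;
        synchronized (lock) {
            ready[0] = true;
            lock.notify();
            // We still own the monitor, so the waiter cannot have left wait() yet.
            afterNotify = waiter.getState();
        }

        try {
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new Thread.State[] {beforeNotify, afterNotify};
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println(states[0] + " -> " + states[1]);
    }
}
```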


As always, the JLS provides the answer (17.1):

The most basic of these methods is synchronization, which is implemented using monitors. Each object in Java is associated with a monitor, which a thread can lock or unlock. Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on that monitor. A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock operation.

So, no, the lock is not a field in the Object (as you can see by simply looking at Object's source code). Rather, each Object is associated with a "monitor", and it is this monitor which is locked or unlocked.
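The part of the quote about locking a monitor multiple times can be seen with a short sketch (the class and method names are invented for this example). A thread re-enters a monitor it already owns without blocking, and it keeps the lock until the outermost `synchronized` block exits.

```java
class ReentrancyDemo {
    private static final Object LOCK = new Object();

    /** Locks LOCK (depth + 1) times on the same thread and returns depth. */
    static int nest(int depth) {
        synchronized (LOCK) {
            if (depth == 0) {
                return 0;
            }
            return 1 + nest(depth - 1); // re-enters the monitor this thread already owns
        }
    }

    /** True if the thread still owns the monitor after an inner unlock. */
    static boolean heldAfterInnerExit() {
        synchronized (LOCK) {
            synchronized (LOCK) {
                // second lock operation by the same thread
            }
            // The inner exit reversed only one lock operation.
            return Thread.holdsLock(LOCK);
        }
    }

    /** True if the calling thread owns the monitor outside any synchronized block. */
    static boolean heldOutside() {
        return Thread.holdsLock(LOCK);
    }

    public static void main(String[] args) {
        System.out.println(nest(3));
        System.out.println(heldAfterInnerExit());
        System.out.println(heldOutside());
    }
}
```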

I just wanted to point out a further reference which details "how Java does it" to make sure it's not overlooked. This is located in the comments of the C++ code which @selig discovered below, and I encourage all upvotes for the content below to go to his answer. You can view the full source code in the link provided there.

    // -----------------------------------------------------------------------------
    // Theory of operations -- Monitors lists, thread residency, etc:
    //
    // * A thread acquires ownership of a monitor by successfully
    //   CAS()ing the _owner field from null to non-null.
    //
    // * Invariant: A thread appears on at most one monitor list --
    //   cxq, EntryList or WaitSet -- at any one time.
    //
    // * Contending threads "push" themselves onto the cxq with CAS
    //   and then spin/park.
    //
    // * After a contending thread eventually acquires the lock it must
    //   dequeue itself from either the EntryList or the cxq.
    //
    // * The exiting thread identifies and unparks an "heir presumptive"
    //   tentative successor thread on the EntryList.  Critically, the
    //   exiting thread doesn't unlink the successor thread from the EntryList.
    //   After having been unparked, the wakee will recontend for ownership of
    //   the monitor.   The successor (wakee) will either acquire the lock or
    //   re-park itself.
    //
    //   Succession is provided for by a policy of competitive handoff.
    //   The exiting thread does _not_ grant or pass ownership to the
    //   successor thread.  (This is also referred to as "handoff" succession").
    //   Instead the exiting thread releases ownership and possibly wakes
    //   a successor, so the successor can (re)compete for ownership of the lock.
    //   If the EntryList is empty but the cxq is populated the exiting
    //   thread will drain the cxq into the EntryList.  It does so by
    //   by detaching the cxq (installing null with CAS) and folding
    //   the threads from the cxq into the EntryList.  The EntryList is
    //   doubly linked, while the cxq is singly linked because of the
    //   CAS-based "push" used to enqueue recently arrived threads (RATs).
    //
    // * Concurrency invariants:
    //
    //   -- only the monitor owner may access or mutate the EntryList.
    //      The mutex property of the monitor itself protects the EntryList
    //      from concurrent interference.
    //   -- Only the monitor owner may detach the cxq.
    //
    // * The monitor entry list operations avoid locks, but strictly speaking
    //   they're not lock-free.  Enter is lock-free, exit is not.
    //   See http://j2se.east/~dice/PERSIST/040825-LockFreeQueues.html
    //
    // * The cxq can have multiple concurrent "pushers" but only one concurrent
    //   detaching thread.  This mechanism is immune from the ABA corruption.
    //   More precisely, the CAS-based "push" onto cxq is ABA-oblivious.
    //
    // * Taken together, the cxq and the EntryList constitute or form a
    //   single logical queue of threads stalled trying to acquire the lock.
    //   We use two distinct lists to improve the odds of a constant-time
    //   dequeue operation after acquisition (in the ::enter() epilog) and
    //   to reduce heat on the list ends.  (c.f. Michael Scott's "2Q" algorithm).
    //   A key desideratum is to minimize queue & monitor metadata manipulation
    //   that occurs while holding the monitor lock -- that is, we want to
    //   minimize monitor lock holds times.  Note that even a small amount of
    //   fixed spinning will greatly reduce the # of enqueue-dequeue operations
    //   on EntryList|cxq.  That is, spinning relieves contention on the "inner"
    //   locks and monitor metadata.
    //
    //   Cxq points to the the set of Recently Arrived Threads attempting entry.
    //   Because we push threads onto _cxq with CAS, the RATs must take the form of
    //   a singly-linked LIFO.  We drain _cxq into EntryList  at unlock-time when
    //   the unlocking thread notices that EntryList is null but _cxq is != null.
    //
    //   The EntryList is ordered by the prevailing queue discipline and
    //   can be organized in any convenient fashion, such as a doubly-linked list or
    //   a circular doubly-linked list.  Critically, we want insert and delete operations
    //   to operate in constant-time.  If we need a priority queue then something akin
    //   to Solaris' sleepq would work nicely.  Viz.,
    //   http://agg.eng/ws/on10_nightly/source/usr/src/uts/common/os/sleepq.c.
    //   Queue discipline is enforced at ::exit() time, when the unlocking thread
    //   drains the cxq into the EntryList, and orders or reorders the threads on the
    //   EntryList accordingly.
    //
    //   Barring "lock barging", this mechanism provides fair cyclic ordering,
    //   somewhat similar to an elevator-scan.
    //
    // * The monitor synchronization subsystem avoids the use of native
    //   synchronization primitives except for the narrow platform-specific
    //   park-unpark abstraction.  See the comments in os_solaris.cpp regarding
    //   the semantics of park-unpark.  Put another way, this monitor implementation
    //   depends only on atomic operations and park-unpark.  The monitor subsystem
    //   manages all RUNNING->BLOCKED and BLOCKED->READY transitions while the
    //   underlying OS manages the READY<->RUN transitions.
    //
    // * Waiting threads reside on the WaitSet list -- wait() puts
    //   the caller onto the WaitSet.
    //
    // * notify() or notifyAll() simply transfers threads from the WaitSet to
    //   either the EntryList or cxq.  Subsequent exit() operations will
    //   unpark the notifyee.  Unparking a notifee in notify() is inefficient -
    //   it's likely the notifyee would simply impale itself on the lock held
    //   by the notifier.
    //
    // * An interesting alternative is to encode cxq as (List,LockByte) where
    //   the LockByte is 0 iff the monitor is owned.  _owner is simply an auxiliary
    //   variable, like _recursions, in the scheme.  The threads or Events that form
    //   the list would have to be aligned in 256-byte addresses.  A thread would
    //   try to acquire the lock or enqueue itself with CAS, but exiting threads
    //   could use a 1-0 protocol and simply STB to set the LockByte to 0.
    //   Note that is is *not* word-tearing, but it does presume that full-word
    //   CAS operations are coherent with intermix with STB operations.  That's true
    //   on most common processors.
    //
    // * See also http://blogs.sun.com/dave


    // -----------------------------------------------------------------------------
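To make the comment above more concrete, here is a heavily simplified toy monitor in Java that follows the same outline: ownership is taken by CAS-ing an owner field from null, contending threads CAS-push themselves onto a LIFO `cxq` and park, and the exiting thread drains the `cxq` into an owner-only `entryList`, releases ownership, and unparks an heir presumptive that must recontend. This is a sketch for illustration only and not HotSpot's implementation. `ToyMonitor`, `Node`, and `demo` are invented names, and the toy makes several assumptions that differ from the real code: it is not reentrant, it has no WaitSet, it does no spinning, and a thread that has acquired the lock is unlinked lazily by a later `exit()` rather than dequeuing itself.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

/** Toy, non-reentrant monitor loosely modelled on the comment above. Not HotSpot's code. */
class ToyMonitor {
    private static final class Node {
        final Thread thread;
        Node next;         // written before the node is published on cxq
        boolean acquired;  // written and read only while owning the monitor
        Node(Thread thread) { this.thread = thread; }
    }

    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicReference<Node> cxq = new AtomicReference<>();
    private final ArrayDeque<Node> entryList = new ArrayDeque<>(); // owner-only

    void enter() {
        Thread self = Thread.currentThread();
        if (owner.compareAndSet(null, self)) {
            return;                                    // uncontended fast path
        }
        Node node = new Node(self);
        do {                                           // CAS "push" onto the LIFO cxq
            node.next = cxq.get();
        } while (!cxq.compareAndSet(node.next, node));
        while (!owner.compareAndSet(null, self)) {     // recontend, otherwise park
            LockSupport.park(this);
        }
        node.acquired = true;                          // unlinked lazily by a later exit()
    }

    void exit() {
        Thread self = Thread.currentThread();
        if (owner.get() != self) {
            throw new IllegalMonitorStateException();
        }
        for (;;) {
            // Drop nodes whose threads have already acquired the monitor.
            while (!entryList.isEmpty() && entryList.peekFirst().acquired) {
                entryList.pollFirst();
            }
            if (entryList.isEmpty()) {
                // Detach the whole cxq by installing null, then fold it into entryList.
                for (Node n = cxq.getAndSet(null); n != null; n = n.next) {
                    if (!n.acquired) {
                        entryList.addLast(n);
                    }
                }
            }
            Node heir = entryList.peekFirst();         // heir presumptive, left on the list
            owner.set(null);                           // release; ownership is not handed off
            if (heir != null) {
                LockSupport.unpark(heir.thread);       // the wakee must recontend
                return;
            }
            // No heir was found. A thread may have pushed itself after the drain,
            // so recheck cxq and take the monitor back if someone is stranded.
            if (cxq.get() == null || !owner.compareAndSet(null, self)) {
                return;                                // nobody waiting, or a new owner will handle it
            }
        }
    }

    /** Increments a plain counter under the monitor from several threads. */
    static int demo(int threads, int perThread) {
        ToyMonitor monitor = new ToyMonitor();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    monitor.enter();
                    try { counter[0]++; } finally { monitor.exit(); }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try { worker.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return counter[0];
    }
}
```

If mutual exclusion holds, `ToyMonitor.demo(threads, perThread)` returns `threads * perThread`. The recheck of `cxq` after the release in `exit()` is the part that is easy to get wrong: without it, a thread that pushes itself between the drain and the release could park with nobody left to wake it.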