How is binary data 'split'?

It's a timing thing (I don't like the word 'just').

In the case of a UART (the A stands for Asynchronous), there is a line-idle state, then a start bit, whose leading edge the receiver uses to synchronise its timing. After that, the data bits are sent, each occupying an equal period. The word finishes with a stop bit, which ensures there is a transition into any following start bit.
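A minimal sketch of that framing (hypothetical helper names; 8N1 assumed, with data sent LSB-first as UARTs do):

```python
# Sketch of 8N1 UART framing: idle-high line, one start bit (0),
# eight data bits LSB-first, one stop bit (1). Names are illustrative.

def uart_frame(byte):
    """Return the line levels for one 8N1 frame, start bit first."""
    bits = [0]                                   # start bit: high-to-low edge
    bits += [(byte >> i) & 1 for i in range(8)]  # data bits, LSB first
    bits.append(1)                               # stop bit: guarantees a 1
    return bits

def uart_deframe(bits):
    """Recover the byte from a 10-bit frame; assumes perfect bit timing."""
    assert bits[0] == 0 and bits[9] == 1         # check the framing
    return sum(b << i for i, b in enumerate(bits[1:9]))

frame = uart_frame(0x41)                         # ASCII 'A'
print(frame)                                     # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
assert uart_deframe(frame) == 0x41
```

Note the stop bit (1) followed by the next start bit (0) is what guarantees an edge at every word boundary.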


Typically both sides have to know ahead of time what the bitrate is.

If not, then you can sometimes detect the speed by measuring the shortest state change: in a stream like 0000011111111101111111, the lone 0 is one bit period wide. But it may be a while before such a pulse shows up, and you may later see something shorter, meaning your first estimate was a multiple of the real period.
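A rough sketch of that detection idea, assuming an oversampled capture of the line (the sample rate and the capture data here are made up for illustration):

```python
# Hedged sketch: estimate the baud rate from an oversampled capture by
# finding the shortest run of identical samples. This only works if the
# traffic contains a lone 1 or 0 somewhere; data like 00110011 would give
# an estimate that is a fraction of the true rate.

def shortest_run(samples):
    """Length (in samples) of the shortest run of equal values."""
    runs, count = [], 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return min(runs)

def estimate_baud(samples, sample_rate):
    return sample_rate / shortest_run(samples)

# A made-up 1 Msample/s capture: idle line, then pulses whose shortest
# width is 104 samples, i.e. roughly 9600 baud.
capture = [1] * 500 + [0] * 104 + [1] * 104 + [0] * 208 + [1] * 300
print(round(estimate_baud(capture, 1_000_000)))   # 9615
```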

Usually both sides know ahead of time. Some things like Ethernet, PCIe, etc. will use a slow, agreed-on speed to talk to each other, then switch to (or test) a higher speed as part of a training exercise.

There are also bit-encoding solutions; IRIG 106 Chapter 4, if I remember right, has a nice chart with NRZ-L/M/S, Biphase-L/M/S, etc. With biphase (Biphase-L in particular, which some industries call Manchester or Manchester II encoding), a bit is defined by a mid-bit-cell state change: each bit is two half-bit cells with a high-to-low or low-to-high transition in the middle, and the direction of that transition determines whether it is a one or a zero. For long streams of data, or continuous data, you will by definition never see more than two half-bit cells at the same level, making it much easier to bit-sync to the other side, so long as both sides agree the encoding is Biphase-L.
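A quick sketch of Biphase-L under one polarity convention (here 1 = high-to-low, 0 = low-to-high; some standards use the opposite, so treat the choice as illustrative):

```python
# Hedged sketch of Biphase-L (Manchester): each bit becomes two half-bit
# cells with a mandatory mid-bit transition. Convention assumed here:
# 1 = high-to-low (1 then 0), 0 = low-to-high (0 then 1).

def manchester_encode(bits):
    half_cells = []
    for b in bits:
        half_cells += [1, 0] if b else [0, 1]   # transition in every bit cell
    return half_cells

def manchester_decode(half_cells):
    # Under this convention the first half-cell of each pair equals the bit.
    return [first for first, _ in zip(half_cells[::2], half_cells[1::2])]

def max_run(seq):
    """Longest run of identical levels on the line."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

data = [1, 1, 1, 1, 0, 0, 0, 0]              # long runs of identical bits
line = manchester_encode(data)
assert manchester_decode(line) == data
assert max_run(line) <= 2                    # never flat for > 2 half-cells
```

That last assertion is the property the bit sync relies on: no matter what the data is, a transition is guaranteed at least every two half-cells.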

In addition to figuring out how often to sample, there is another typical problem when each side uses its own oscillator: even with an agreed-upon data rate, the two clocks drift relative to each other over time, and you have to solve that somehow. In a situation like UART, where your frame/message is very short, you can use the start-bit edge to define where the mid-bit-cell sample falls for the agreed-upon rate; over the 10 or so bit cells of a frame there can be a fair amount of difference between the clocks and it still works out. If you really wanted to, you could re-adjust on every edge that does show up (if any).
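Back-of-envelope arithmetic for that UART case: resyncing on the start-bit edge and sampling mid-bit, the last sample (the stop bit) lands about 9.5 bit periods after the edge, so the combined clock error just has to keep that sample inside its bit cell:

```python
# Rough drift-budget sketch for an 8N1 UART frame. The receiver resyncs on
# the start-bit edge and samples each bit at its center; the accumulated
# timing error at the last sample must stay under half a bit period.

frame_bits = 10                      # start + 8 data + stop (8N1)
last_sample = frame_bits - 0.5       # 9.5 bit periods after the start edge
max_total_error = 0.5 / last_sample  # fractional mismatch between the clocks

print(f"total budget: {max_total_error:.1%}")      # ~5.3%
print(f"per side:     {max_total_error / 2:.1%}")  # ~2.6% if split evenly
```

That is why a UART tolerates a few percent of clock mismatch, while a long continuous stream with no resync point would not.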

In the telemetry world (the IRIG 106 thing) the data is continuous, and the bit sync handles oversampling and keeping track, but you need some number of transitions over time to be able to do this; hence those encoding schemes that ensure bit transitions.

Some protocols ensure transitions in other ways. Ethernet doesn't gracefully solve the drift problem: the receiver can/will use a recovered clock to sample the data, but once the data is made into bytes and packets it gets buffered and dealt with. If your crystal is a little slower than the sender's, you are at line rate, and you are doing a store-and-forward thing (you are a router, let's say), then you will eventually overflow your buffers and have to flow-control. So that is the mechanism there.

Then of course, as commented, if you are synchronous instead of async, you "simply" carry the clock with you, but depending on the speeds and cable lengths there can be a shift between clock and data, and that has to be dealt with. SPI certainly has this problem: some products (like flashes) see the request you sent on the rising edge, start their answer on the falling edge, and you sample based on your rising edge, so you have half a clock cycle for the round trip. If you, for example, put a CPLD or FPGA inline so that you can backdoor-access the flash (rather than wire-OR-sharing the signals), you can make the round trip too long. QDR memory has this issue, and some solutions for it as well; it is the same story with parallel data and a clock along for the ride: if the signals have any timing issues, you can be sampling one or both sides at the wrong time. PCIe carries a clock with it, but it is the reference clock, not the gigahertz clock used to make the data (it's the reference clock used to make the gigahertz clock).
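A toy timing budget for that SPI round trip. All the delay numbers here are assumptions for illustration, not from any datasheet:

```python
# Hedged timing-budget sketch for the SPI case above: the flash launches
# data on the falling edge and the master samples on the next rising edge,
# so the flash's output delay plus any added path delay (an inline FPGA,
# long wires) plus the master's setup time must fit in half a clock period.

clock_hz    = 50e6                 # SPI clock (assumed)
half_period = 1 / (2 * clock_hz)   # 10 ns window for the round trip
flash_tco   = 6e-9                 # flash clock-to-output delay (assumed)
fpga_delay  = 5e-9                 # extra delay through an inline FPGA (assumed)
master_tsu  = 2e-9                 # master's setup time (assumed)

slack = half_period - (flash_tco + fpga_delay + master_tsu)
print(f"slack: {slack * 1e9:.1f} ns")   # negative: the round trip no longer fits
```

With these (made-up) numbers the slack goes negative the moment the FPGA is inserted in the path, which is exactly the "round trip too long" failure described above; the usual fixes are slowing the clock or re-timing the return path.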

Short answer: most of the time both sides agree on the speed ahead of time, either by protocol (a fixed speed), by a training exercise at an agreed/defined slower speed, or purely by convention, like UART: I know the other side is 115200 8N1, so I set my side to that. On occasion, with UART for example, you can detect the speed, but you have to have the other side send characters that allow that (characters with the right transitions: not 00110011, but a 101 or 010 somewhere).


There are two general approaches:

Asynchronous

Asynchronous serial communication is a form of serial communication in which the communicating endpoints' interfaces are not continuously synchronized by a common clock signal.

A UART is a piece of hardware that enables a device to send and receive data asynchronously -- that is, not synchronized to a clock signal.

If you were to look at an RS232 serial port pinout, you'd notice that there are Receive Data (RXD) and Transmit Data (TXD) pins, but a "clock" pin is notably absent.

The two sides of the communications channel have to agree (ahead of time) to the rate at which bits are going to be sent. This is why you have to configure the baud rate (symbol rate) in order for your serial port devices to work.

You'll also see things like "start" and "stop" bits -- these are used to help synchronize the receiver to the incoming stream.

Asynchronous busses include:

  • SATA
  • UART
  • USB

Synchronous

Synchronous communication requires that the clocks in the transmitting and receiving devices are synchronized – running at the same rate – so the receiver can sample the signal at the same time intervals used by the transmitter. No start or stop bits are required.

Synchronous busses use a clock signal, a square wave, that rises and falls at a very regular, specific rate. All of the other signals are synchronized to this signal; their values are typically sampled at a rising- or falling-edge of the clock.
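A toy model of that edge sampling, with lists standing in for sampled waveforms (names and data are illustrative):

```python
# Minimal sketch of synchronous sampling: the data line is held valid
# around each rising edge of the clock, so the receiver latches the data
# value on every 0 -> 1 clock transition. No start/stop bits needed.

def sample_on_rising_edge(clock, data):
    """Latch data[i] whenever clock transitions 0 -> 1 at sample i."""
    return [d for prev, cur, d in zip(clock, clock[1:], data[1:])
            if prev == 0 and cur == 1]

clock = [0, 1, 0, 1, 0, 1, 0, 1]
data  = [1, 1, 0, 0, 1, 1, 0, 0]          # stable across each rising edge
print(sample_on_rising_edge(clock, data))  # [1, 0, 1, 0]
```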

Because the receiving end has a good clock reference, synchronous busses can run at a much higher rate.

Most on-board, chip-to-chip busses are synchronous, including: