If you are not sure what UUIDs are, this article explains what they are and how they work. The examples focus on identifiers for database records, although the same ideas apply to any kind of unique ID generation.
UUID is short for Universally Unique Identifier: a 128-bit identifier that a machine generates by following a standard algorithm.
This means that two correctly generated UUIDs have virtually no chance of being identical, even if they are created in different places and environments. That is why we speak of a unique identifier.
To make collisions this unlikely, the generation algorithms draw on inputs such as the MAC address of the network card of the computer where the UUID is created, a timestamp, a namespace, and a random (in practice often pseudo-random) number. Which of these inputs are used depends on the algorithm that builds the UUID.
Because of this complexity, a UUID should be generated by a computer if we want its uniqueness to hold.
Values should not be made up by hand. A person choosing values manually has no way of knowing which UUIDs are already attached to other objects, so they would risk creating duplicates and making the identifiers useless.
What we must be clear about is that the main goal of any UUID generation algorithm is that its output is unique and can be trusted to be so.
UUID: Versions
A UUID is a 128-bit (16-byte) number, usually represented as a 36-character string: 32 hexadecimal digits plus four hyphens.
Taking a time-based (version 1) UUID as the reference, the string is made up of several parts:
- The timestamp and the version number are split across the first three groups and occupy 16 characters (60 bits + 4 bits).
- The clock sequence, together with the variant bits, occupies 4 characters (14 bits + 2 bits).
- The node ID occupies 12 characters (48 bits).
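The layout above can be inspected with Python's standard `uuid` module. This is only a sketch: the helper `describe_layout` is a hypothetical name introduced for illustration, and a version 1 UUID is used because the field breakdown describes time-based UUIDs.

```python
import uuid


def describe_layout(u: uuid.UUID) -> dict:
    """Summarize the parts of a UUID (hypothetical helper for illustration)."""
    groups = str(u).split("-")
    return {
        # Five groups of hexadecimal characters: 8-4-4-4-12.
        "group_lengths": [len(g) for g in groups],
        # 4-bit version number stored in the third group.
        "version": u.version,
        # The timestamp fits in 60 bits, spread over the first three groups.
        "timestamp_bits": u.time.bit_length(),
        # 14-bit clock sequence stored in the fourth group.
        "clock_seq": u.clock_seq,
        # 48-bit node ID, printed as the last 12 characters.
        "node_hex": f"{u.node:012x}",
    }


layout = describe_layout(uuid.uuid1())
print(layout)
```

The exact values change on every run, since they depend on the current time and on the machine.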
UUIDs come in several versions, each with its own generation algorithm.
These versions are the following:
- Version 1: combines a timestamp, a clock sequence, and a value that is specific to the generating device (typically its MAC address) to produce an output that is unique.
- Version 2: a variation of version 1 intended for a specific computing environment, DCE (Distributed Computing Environment).
- Version 3: hashes a “namespace” and a “name” with MD5 to create a value that is unique to that name within the namespace. The big difference from the time-based and random versions is that generating a UUID again with the same namespace and the same name yields exactly the same value, so here two equal UUIDs are expected by design.
- Version 4: the most widely used today. It draws its value from the host's source of random or pseudo-random numbers, which makes it virtually impossible for any two UUIDs to be the same.
- Version 5: very similar to version 3 and equally repeatable, except that it uses SHA-1, a stronger hash function than MD5.
Whichever version we use, the identifier looks practically the same, apart from the details specific to each version. Versions 1 and 4 are the most widely used, version 4 above all because of its random character.
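As a rough illustration of how the versions differ, the sketch below generates one UUID of each kind with Python's standard `uuid` module. The name `example.com` is a placeholder chosen for this example, and version 2 is left out because the module offers no generator for it.

```python
import uuid

# Placeholder name, used only for illustration.
name = "example.com"

v1 = uuid.uuid1()                          # timestamp + clock sequence + node
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, name)  # MD5 hash of namespace + name
v4 = uuid.uuid4()                          # random
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, name)  # SHA-1 hash of namespace + name

for u in (v1, v3, v4, v5):
    print(u.version, u)

# Name-based versions are repeatable: same namespace and name, same UUID.
v3_repeatable = uuid.uuid3(uuid.NAMESPACE_DNS, name) == v3
v5_repeatable = uuid.uuid5(uuid.NAMESPACE_DNS, name) == v5

# Random UUIDs are expected to differ on every call.
v4_differs = uuid.uuid4() != v4

print(v3_repeatable, v5_repeatable, v4_differs)
```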
An example of a UUID string would be:
16763be4-6022-406e-a950-fcd5018633ca
As you can see, the value is represented as five groups of hexadecimal characters separated by hyphens. The hyphens are not an essential part of the value, since all they do is make the identifier much easier for the human eye to read.
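A quick way to check that the hyphens carry no information is to parse the example both ways. The sketch below assumes Python's standard `uuid` module, whose `UUID` constructor accepts the string with or without hyphens.

```python
import uuid

with_hyphens = uuid.UUID("16763be4-6022-406e-a950-fcd5018633ca")
without_hyphens = uuid.UUID("16763be46022406ea950fcd5018633ca")

# Both spellings describe the same 128-bit value.
print(with_hyphens == without_hyphens)  # True

# The first digit of the third group gives the version of the example.
print(with_hyphens.version)  # 4

# The canonical text form has 36 characters, 32 of them hexadecimal digits.
print(len(str(with_hyphens)), len(with_hyphens.hex))  # 36 32
```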
UUID: Uses
UUIDs are often used to create unique identifiers in a decentralized way. They can be generated safely in back-end code, on a client device, or by your database engine.
Previously, most applications used an auto-incrementing integer field as the primary key. As a result, the ID could not be known until the record was in the database. With UUIDs, the ID can be assigned much earlier, in the application itself.
Let's look at a PHP example that does not use UUIDs.
The ID property has to be initialized to null, since the real ID cannot be known until the record has been inserted into the database.
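The same situation can be sketched in Python with the standard `sqlite3` module, where `None` plays the role of PHP's `null`. The `BlogPost` class and the `posts` table are hypothetical names used only for this illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlogPost:
    title: str
    id: Optional[int] = None  # unknown until the database assigns it


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT)"
)

post = BlogPost(title="Hello")
id_before = post.id  # still None: the object exists, but it has no identity yet

cur = conn.execute("INSERT INTO posts (title) VALUES (?)", (post.title,))
conn.commit()

# Only after the INSERT can the real ID be copied back into the object.
post.id = cur.lastrowid

print(id_before, post.id)
```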
Switching to UUIDs solves the problem.
Now the identifier can be generated when the object is created, so it is known from the start, without any realistic risk of duplication.
This ensures that object instances always represent a valid state and never need a null ID property. It also eases the handling of child records that need a reference to their parent (such as an Author): they can be created on the fly, without having to wait for the parent to be inserted first.
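A minimal Python sketch of this approach follows, again with the standard `sqlite3` and `uuid` modules. The `Author` and `BlogPost` classes, the `new_id` helper, and the table names are assumptions made for the example. The ID is stored as text here for simplicity.

```python
import sqlite3
import uuid
from dataclasses import dataclass, field


def new_id() -> str:
    """Return a random (version 4) UUID as a string."""
    return str(uuid.uuid4())


@dataclass
class Author:
    name: str
    id: str = field(default_factory=new_id)  # known as soon as the object exists


@dataclass
class BlogPost:
    title: str
    author_id: str  # reference to the parent, available before any INSERT
    id: str = field(default_factory=new_id)


author = Author(name="Jane")
post = BlogPost(title="Hello UUIDs", author_id=author.id)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE posts (id TEXT PRIMARY KEY, title TEXT, "
    "author_id TEXT REFERENCES authors(id))"
)

# Both rows are written with IDs chosen by the application, not by the database.
conn.execute(
    "INSERT INTO authors (id, name) VALUES (?, ?)", (author.id, author.name)
)
conn.execute(
    "INSERT INTO posts (id, title, author_id) VALUES (?, ?, ?)",
    (post.id, post.title, post.author_id),
)
conn.commit()

row = conn.execute(
    "SELECT posts.title, authors.name FROM posts "
    "JOIN authors ON authors.id = posts.author_id"
).fetchone()
print(row)
```

Because neither object ever holds an empty ID, the post can point at its author before either row has been written.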
UUIDs also help you combine data from various sources. Because they are unique across systems, they are a good choice for replicated structures and for data that is frequently moved between different storage systems.
Still, UUIDs have certain drawbacks. At 128 bits, they are four times larger than a 32-bit integer, which can be a handicap in large data sets.
Another drawback is that the values are awkward to sort and index, especially the more common random UUIDs. This can be detrimental to performance in some cases.
In any case, this last problem can be mitigated by storing UUIDs as binary data rather than as text, as databases such as PostgreSQL or MySQL allow, where a UUID string can be converted to a binary value.
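The difference between the text and binary forms can be seen in a short Python sketch using the standard `uuid` module. It only illustrates the two representations and does not show how any particular database stores them.

```python
import uuid

u = uuid.UUID("16763be4-6022-406e-a950-fcd5018633ca")

as_text = str(u)     # canonical string form
as_binary = u.bytes  # raw 128-bit value

print(len(as_text), len(as_binary))  # 36 16

# The binary form converts back to the same UUID without loss.
restored = uuid.UUID(bytes=as_binary)
print(restored == u)  # True
```

The binary form takes 16 bytes against 36 characters for the string, so keys and indexes built on it are smaller.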
We hope that after everything you have read you understand better what a UUID is, how it works and the possibilities it has, along with its advantages and disadvantages.
George is Digismak’s reporter cum editor with 13 years of experience in journalism.