Introduction to ObjectID
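In MongoDB, an ObjectId is a special 12-byte BSON type that uniquely identifies a document. When a document is inserted without an _id field, an ObjectId is generated as the default value, so it shows up in the documents of nearly every MongoDB collection. An ObjectId is made up of the following parts:
1. Timestamp:
- The first 4 bytes (32 bits) hold the Unix timestamp of the moment the ObjectId was created, in whole seconds since 00:00:00 UTC on January 1, 1970, stored big-endian. This makes ObjectIds roughly ordered by creation time.
2. Random value (formerly Machine ID and PID):
- The next 5 bytes (40 bits) are a random value generated once per process. Older descriptions split these bytes into a 3-byte machine identifier (usually derived by hashing host information such as a network interface address) followed by a 2-byte process ID. Newer versions no longer use the machine identifier or the process ID, which avoids collisions caused by PID reuse. The Go code later in this post follows the newer layout.
3. Counter:
- The last 3 bytes (24 bits) are an incrementing counter. It is initialized to a random value when the process starts and is incremented for every ObjectId generated, so IDs created by the same process within the same second stay unique as long as fewer than 2^24 (about 16.7 million) are generated in that second.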
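With this design, every document in a distributed environment gets an identifier that is unique for all practical purposes, and the identifier also carries its creation time. That is useful in many scenarios, such as sorting, indexing, and application logic.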
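To make the byte layout above concrete, here is a minimal sketch that splits a 24-character hex string into the three fields. The names objectIDParts and splitObjectID are made up for this illustration and are not part of any driver API, and the ID in main is an arbitrary example value.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// objectIDParts holds the three fields of a 12-byte ObjectID.
// This type is hypothetical and exists only for this example.
type objectIDParts struct {
	Timestamp time.Time // bytes 0-3: seconds since the Unix epoch, big-endian
	Random    [5]byte   // bytes 4-8: per-process random value
	Counter   uint32    // bytes 9-11: 24-bit counter, big-endian
}

// splitObjectID decodes a 24-character hex string and splits it into fields.
func splitObjectID(s string) (objectIDParts, error) {
	var p objectIDParts
	if len(s) != 24 {
		return p, errors.New("an ObjectID hex string must be 24 characters long")
	}
	var raw [12]byte
	if _, err := hex.Decode(raw[:], []byte(s)); err != nil {
		return p, err
	}
	p.Timestamp = time.Unix(int64(binary.BigEndian.Uint32(raw[0:4])), 0).UTC()
	copy(p.Random[:], raw[4:9])
	p.Counter = uint32(raw[9])<<16 | uint32(raw[10])<<8 | uint32(raw[11])
	return p, nil
}

func main() {
	// An arbitrary ID used only for illustration.
	p, err := splitObjectID("65a1b2c3aabbccddee000001")
	if err != nil {
		panic(err)
	}
	fmt.Println("timestamp:", p.Timestamp)
	fmt.Println("random:   ", hex.EncodeToString(p.Random[:]))
	fmt.Println("counter:  ", p.Counter)
}
```

Because the timestamp sits in the leading bytes, the creation time can be recovered from the ID alone, without a separate created-at field.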
Using ObjectID
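If a distributed system needs IDs that are globally unique and also ordered, ObjectID is worth considering.
A UUID is longer (36 characters in its usual string form, versus 24 hex characters for an ObjectID), and a random UUID has no ordering. That does not feel ideal to me, so ObjectID is a reasonable choice. There are, of course, many other schemes.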
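The ordering claim can be checked with a small sketch. The helper idAt is hypothetical. It builds an ObjectID-style value from a given Unix time and a fixed 8-byte tail so that the result is deterministic, whereas a real generator would put the random value and the counter in the tail. Since the timestamp is stored big-endian in the leading bytes, sorting the hex strings should also sort them by time, to one-second resolution.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sort"
)

// idAt builds a 12-byte ObjectID-style value whose first 4 bytes are the
// given Unix time in seconds (big-endian). The remaining 8 bytes are supplied
// by the caller to keep the example deterministic.
func idAt(unixSecs uint32, tail [8]byte) string {
	var b [12]byte
	binary.BigEndian.PutUint32(b[0:4], unixSecs)
	copy(b[4:], tail[:])
	return hex.EncodeToString(b[:])
}

func main() {
	// Deliberately out of time order, with tails that would sort the other way.
	ids := []string{
		idAt(1700000002, [8]byte{0x01}),
		idAt(1700000000, [8]byte{0xff}),
		idAt(1700000001, [8]byte{0x7f}),
	}
	sort.Strings(ids)
	for _, id := range ids {
		fmt.Println(id, len(id)) // each ID is 24 hex characters
	}
}
```

After sorting, the IDs come out in timestamp order regardless of the tail bytes. Within the same second the order depends on the random value and the counter, so the ordering is only approximate.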
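For Go projects, the MongoDB driver package contains a file named objectid.go with a ready-made ObjectID generation algorithm. If the project only needs the algorithm, there is no need to pull in the whole package. You can copy that one file out directly.
Its contents (Go) are as follows: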
package hobjectid

import (
	"crypto/rand"
	"encoding"
	"encoding/binary"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"sync/atomic"
	"time"
)

// Code from https://github.com/mongodb/mongo-go-driver/blob/v1/bson/primitive/objectid.go

// ErrInvalidHex indicates that a hex string cannot be converted to an ObjectID.
var ErrInvalidHex = errors.New("the provided hex string is not a valid ObjectID")

// ObjectID is the BSON ObjectID type.
type ObjectID [12]byte

// NilObjectID is the zero value for ObjectID.
var NilObjectID ObjectID

var objectIDCounter = readRandomUint32()
var processUnique = processUniqueBytes()

var _ encoding.TextMarshaler = ObjectID{}
var _ encoding.TextUnmarshaler = &ObjectID{}

// NewObjectID generates a new ObjectID.
func NewObjectID() ObjectID {
	return NewObjectIDFromTimestamp(time.Now())
}

// NewObjectIDFromTimestamp generates a new ObjectID based on the given time.
func NewObjectIDFromTimestamp(timestamp time.Time) ObjectID {
	var b [12]byte

	binary.BigEndian.PutUint32(b[0:4], uint32(timestamp.Unix()))
	copy(b[4:9], processUnique[:])
	putUint24(b[9:12], atomic.AddUint32(&objectIDCounter, 1))

	return b
}

// Timestamp extracts the time part of the ObjectId.
func (id ObjectID) Timestamp() time.Time {
	unixSecs := binary.BigEndian.Uint32(id[0:4])
	return time.Unix(int64(unixSecs), 0).UTC()
}

// Hex returns the hex encoding of the ObjectID as a string.
func (id ObjectID) Hex() string {
	var buf [24]byte
	hex.Encode(buf[:], id[:])
	return string(buf[:])
}

func (id ObjectID) String() string {
	return fmt.Sprintf("ObjectID(%q)", id.Hex())
}

// IsZero returns true if id is the empty ObjectID.
func (id ObjectID) IsZero() bool {
	return id == NilObjectID
}

// ObjectIDFromHex creates a new ObjectID from a hex string. It returns an error if the hex string is not a
// valid ObjectID.
func ObjectIDFromHex(s string) (ObjectID, error) {
	if len(s) != 24 {
		return NilObjectID, ErrInvalidHex
	}

	var oid [12]byte
	_, err := hex.Decode(oid[:], []byte(s))
	if err != nil {
		return NilObjectID, err
	}

	return oid, nil
}

// IsValidObjectID returns true if the provided hex string represents a valid ObjectID and false if not.
//
// Deprecated: Use ObjectIDFromHex and check the error instead.
func IsValidObjectID(s string) bool {
	_, err := ObjectIDFromHex(s)
	return err == nil
}

// MarshalText returns the ObjectID as UTF-8-encoded text. Implementing this allows us to use ObjectID
// as a map key when marshalling JSON. See https://pkg.go.dev/encoding#TextMarshaler
func (id ObjectID) MarshalText() ([]byte, error) {
	return []byte(id.Hex()), nil
}

// UnmarshalText populates the byte slice with the ObjectID. Implementing this allows us to use ObjectID
// as a map key when unmarshalling JSON. See https://pkg.go.dev/encoding#TextUnmarshaler
func (id *ObjectID) UnmarshalText(b []byte) error {
	oid, err := ObjectIDFromHex(string(b))
	if err != nil {
		return err
	}
	*id = oid
	return nil
}

// MarshalJSON returns the ObjectID as a string
func (id ObjectID) MarshalJSON() ([]byte, error) {
	return json.Marshal(id.Hex())
}

// UnmarshalJSON populates the byte slice with the ObjectID. If the byte slice is 24 bytes long, it
// will be populated with the hex representation of the ObjectID. If the byte slice is twelve bytes
// long, it will be populated with the BSON representation of the ObjectID. This method also accepts empty strings and
// decodes them as NilObjectID. For any other inputs, an error will be returned.
func (id *ObjectID) UnmarshalJSON(b []byte) error {
	// Ignore "null" to keep parity with the standard library. Decoding a JSON null into a non-pointer ObjectID field
	// will leave the field unchanged. For pointer values, encoding/json will set the pointer to nil and will not
	// enter the UnmarshalJSON hook.
	if string(b) == "null" {
		return nil
	}

	var err error
	switch len(b) {
	case 12:
		copy(id[:], b)
	default:
		// Extended JSON
		var res interface{}
		err := json.Unmarshal(b, &res)
		if err != nil {
			return err
		}
		str, ok := res.(string)
		if !ok {
			m, ok := res.(map[string]interface{})
			if !ok {
				return errors.New("not an extended JSON ObjectID")
			}
			oid, ok := m["$oid"]
			if !ok {
				return errors.New("not an extended JSON ObjectID")
			}
			str, ok = oid.(string)
			if !ok {
				return errors.New("not an extended JSON ObjectID")
			}
		}

		// An empty string is not a valid ObjectID, but we treat it as a special value that decodes as NilObjectID.
		if len(str) == 0 {
			copy(id[:], NilObjectID[:])
			return nil
		}

		if len(str) != 24 {
			return fmt.Errorf("cannot unmarshal into an ObjectID, the length must be 24 but it is %d", len(str))
		}

		_, err = hex.Decode(id[:], []byte(str))
		if err != nil {
			return err
		}
	}

	return err
}

// processUniqueBytes returns 5 random bytes that identify this process.
func processUniqueBytes() [5]byte {
	var b [5]byte
	_, err := io.ReadFull(rand.Reader, b[:])
	if err != nil {
		panic(fmt.Errorf("cannot initialize objectid package with crypto.rand.Reader: %w", err))
	}

	return b
}

// readRandomUint32 returns a random uint32, used as the counter's starting value.
func readRandomUint32() uint32 {
	var b [4]byte
	_, err := io.ReadFull(rand.Reader, b[:])
	if err != nil {
		panic(fmt.Errorf("cannot initialize objectid package with crypto.rand.Reader: %w", err))
	}

	return (uint32(b[0]) << 0) | (uint32(b[1]) << 8) | (uint32(b[2]) << 16) | (uint32(b[3]) << 24)
}

// putUint24 writes the low 24 bits of v into b in big-endian order.
func putUint24(b []byte, v uint32) {
	b[0] = byte(v >> 16)
	b[1] = byte(v >> 8)
	b[2] = byte(v)
}

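Because the IDs come from a generation algorithm, they do not depend on the environment or on the business logic, which makes this approach more general-purpose.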
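As a rough check that such an algorithm is safe to call from many goroutines, the sketch below re-implements the same idea in a few lines (timestamp, per-process random bytes, atomic counter) and counts the distinct IDs produced concurrently. The names newID and generateConcurrently are made up for this example. It is a simplified stand-in for the listing above, not the driver's code.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

var (
	procRandom [5]byte // random value chosen once per process
	counter    uint32  // starts at a random value, incremented atomically
)

func init() {
	if _, err := rand.Read(procRandom[:]); err != nil {
		panic(err)
	}
	var seed [4]byte
	if _, err := rand.Read(seed[:]); err != nil {
		panic(err)
	}
	counter = binary.BigEndian.Uint32(seed[:])
}

// newID returns a 24-character hex ID: 4-byte timestamp, 5 random bytes,
// and the low 24 bits of the counter.
func newID() string {
	var b [12]byte
	binary.BigEndian.PutUint32(b[0:4], uint32(time.Now().Unix()))
	copy(b[4:9], procRandom[:])
	c := atomic.AddUint32(&counter, 1)
	b[9], b[10], b[11] = byte(c>>16), byte(c>>8), byte(c)
	return hex.EncodeToString(b[:])
}

// generateConcurrently creates perWorker IDs in each of workers goroutines
// and returns how many distinct IDs were produced.
func generateConcurrently(workers, perWorker int) int {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen = make(map[string]struct{}, workers*perWorker)
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				id := newID()
				mu.Lock()
				seen[id] = struct{}{}
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return len(seen)
}

func main() {
	fmt.Println(generateConcurrently(8, 1000)) // distinct IDs out of 8000 generated
}
```

The atomic increment is what keeps the IDs distinct within one process. Across processes or machines, uniqueness rests on the 5 random bytes differing, which is very likely but not strictly guaranteed.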