db_open
NAME
db_open - database access methods
SYNOPSIS
#include <db.h>
int
db_open(const char *file, DBTYPE type,
int flags, int mode, DB_ENV *dbenv, DB_INFO *dbinfo, DB **dbpp);
DESCRIPTION
The DB library is a family of groups of functions that
provides a modular programming interface to transactions
and record-oriented file access. The library includes
support for transactions, locking, logging and file page
caching, as well as various indexed access methods. Many
of the functional groups (e.g., the file page caching
functions) are useful independent of the other DB func-
tions, although some functional groups are explicitly
based on other functional groups (e.g., transactions and
logging). For a general description of the DB package,
see db_intro(3).
This manual page describes the overall structure of the DB
library access methods.
The currently supported file formats are btree, hashed and
recno. The btree format is a representation of a sorted,
balanced tree structure. The hashed format is an extensi-
ble, dynamic hashing scheme. The recno format supports
fixed or variable length records (optionally retrieved
from a flat text file).
Storage and retrieval for the DB access methods are based
on key/data pairs, or DBT structures as they are typedef'd
in the <db.h> include file. See db_dbt(3) for specific
information on the structure and capabilities of a DBT.
The db_open function opens the database represented by
file for both reading and writing. Files never intended
to be shared or preserved on disk may be created by set-
ting the file parameter to NULL.
The db_open function copies a pointer to a DB structure
(as typedef'd in the <db.h> include file), into the memory
location referenced by dbpp. This structure includes a
set of functions to perform various database actions, as
described below. The db_open function returns the value
of errno on failure and 0 on success.
Note, while most of the access methods use file as the
name of an underlying file on disk, this is not guaran-
teed. Also, calling db_open is a reasonably expensive
operation. (This is based on a model where the DBMS keeps
a set of files open for a long time rather than opening
and closing them on each query.)
The type argument is of type DBTYPE (as defined in the
<db.h> include file) and must be set to one of DB_BTREE,
DB_HASH, DB_RECNO or DB_UNKNOWN. If type is DB_UNKNOWN,
the database must already exist and db_open will then
determine if it is of type DB_BTREE, DB_HASH or DB_RECNO.
The flags and mode arguments specify how files will be
opened and/or created when they don't already exist. The
flags value is specified by or'ing together one or more of
the following values:
DB_CREATE
Create any underlying files, as necessary. If the
files do not already exist and the DB_CREATE flag is
not specified, the call will fail.
DB_NOMMAP
Do not map this file (see db_mpool(3) for further
information).
DB_RDONLY
Open the database for reading only. Any attempt to
write the database using the access methods will fail
regardless of the actual permissions of any underly-
ing files.
DB_THREAD
Cause the DB handle returned by the db_open function
to be usable by multiple threads within a single
address space, i.e., to be ``free-threaded''.
DB_TRUNCATE
``Truncate'' the database if it exists, i.e., behave
as if the database were just created, discarding any
previous contents.
All files created by the access methods are created with
mode mode (as described in chmod(2)) and modified by the
process' umask value at the time of creation (see
umask(2)). The group ownership of created files is based
on the system and directory defaults, and is not further
specified by DB.
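As a rough illustration of how the mode argument and the umask combine, the sketch below computes the permission bits a newly created file would end up with. The helper name effective_mode is hypothetical and is not part of DB; it only mirrors the usual "mode & ~umask" rule described in umask(2).

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of DB): permission bits left on a newly
 * created file when `mode` is requested and `mask` is the process umask.
 */
static mode_t
effective_mode(mode_t mode, mode_t mask)
{
	/* Only permission bits are meaningful here. */
	assert((mode & ~(mode_t)07777) == 0);
	return (mode & ~mask);
}
```

For example, a mode of 0666 with a umask of 022 yields files created with permissions 0644.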
DB_ENV
The access methods make calls to the other subsystems in
the DB library based on the dbenv argument to db_open,
which is a pointer to a structure of type DB_ENV (type-
def'd in <db.h>). It is expected that applications will
use a single DB_ENV structure as the argument to all of
the subsystems in the DB package. In order to ensure com-
patibility with future releases of DB, all fields of the
DB_ENV structure that are not explicitly set should be
initialized to 0 before the first time the structure is
used. Do this by declaring the structure external or
static, or by calling the C library routine bzero(3) or
memset(3).
The fields of DB_ENV used by db_open are described below.
As references to the DB_ENV structure may be maintained by
db_open, it is necessary that the DB_ENV structure and
memory it references be valid until after the close func-
tion is called. If dbenv is NULL or any of its fields are
set to 0, defaults appropriate for the system are used
where possible.
The following DB_ENV fields may be initialized before
calling db_open:
DB_LOG *lg_info;
If modifications to the file being opened should be
logged, the lg_info field contains a return value
from the function log_open. If lg_info is NULL, no
logging is done by the DB access methods.
DB_LOCKTAB *lk_info;
If locking is required for the file being opened (as
is the case when multiple processes or threads are
accessing the same file), the lk_info field contains
a return value from the function lock_open. If
lk_info is NULL, no locking is done by the DB access
methods.
If both locking and transactions are being performed
(i.e., both lk_info and tx_info are non-NULL), the
transaction ID will be used as the locker ID. If
only locking is being performed, db_open will acquire
a locker ID from lock_id(3), and will use it for all
locks required for this instance of db_open.
DB_MPOOL *mp_info;
If the cache for the file being opened should be
maintained in a shared buffer pool, the mp_info field
contains a return value from the function memp_open.
If mp_info is NULL, a memory pool may still be cre-
ated by DB, but it will be private to the application
and managed by DB.
DB_TXNMGR *tx_info;
If the accesses to the file being opened should take
place in the context of transactions (providing atom-
icity and error recovery), the tx_info field contains
a return value from the function txn_open (see
db_txn(3)). If transactions are specified, the
application is responsible for making suitable calls
to txn_begin, txn_abort, and txn_commit. If tx_info
is NULL, no transaction support is done by the DB
access methods.
When the access methods are used in conjunction with
transactions, the application must abort the transac-
tion (using txn_abort) if any of the transaction pro-
tected access method calls (i.e., any calls other
than open, close and sync) returns a system error
(e.g., deadlock, which returns EAGAIN). As described
by db_intro(3), a system error is any value greater
than 0.
DB_INFO
The access methods are configured using the DB_INFO data
structure argument to db_open. The DB_INFO structure is
typedef'd in <db.h> and has a large number of fields, most
specific to a single access method, although a few are
shared. The fields that are common to all access methods
are listed here; those specific to an individual access
method are described below. No reference to the DB_INFO
structure is maintained by DB, so it is possible to dis-
card it as soon as the db_open call returns.
In order to ensure compatibility with future releases of
DB, all fields of the DB_INFO structure should be initial-
ized to 0 before the structure is used. Do this by
declaring the structure external or static, or by calling
the C library function bzero(3) or memset(3).
If possible, defaults appropriate for the system are used
for the DB_INFO fields if dbinfo is NULL or any fields of
the DB_INFO structure are set to 0. The following DB_INFO
fields may be initialized before calling db_open:
size_t db_cachesize;
A suggested maximum size of the memory pool cache, in
bytes. If db_cachesize is 0, an appropriate default
is used. If the mp_info field is also specified,
this field is ignored.
Note, the minimum number of pages in the cache should
be no less than 10, and the access methods will fail
if an insufficiently large cache is specified. In
addition, for applications that exhibit strong local-
ity in their data access patterns, increasing the
size of the cache can significantly improve applica-
tion performance.
int db_lorder;
The byte order for integers in the stored database
metadata. The number should represent the order as
an integer; for example, big endian order is the number
4321, and little endian order is the number 1234. If
db_lorder is 0, the host order of the machine where the
DB library was compiled is used.
The value of db_lorder is ignored except when
databases are being created. If a database already
exists, the byte order it uses is determined when the
file is read.
The access methods provide no guarantees about the
byte ordering of the application data stored in the
database, and applications are responsible for main-
taining any necessary ordering.
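An application that wants to record or set the byte order explicitly could determine it along the following lines. This is only a sketch: host_lorder is a hypothetical helper, not a DB function, and it reports the order of the machine running the code rather than the machine where the DB library was compiled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DB): report the byte order of the
 * machine running this code in the db_lorder convention
 * (1234 = little endian, 4321 = big endian).
 */
static int
host_lorder(void)
{
	uint32_t probe = 0x01020304;
	unsigned char first;

	/* Inspect the byte stored at the lowest address. */
	memcpy(&first, &probe, 1);
	assert(first == 0x04 || first == 0x01);
	return (first == 0x04 ? 1234 : 4321);
}
```

The returned value could then be assigned to db_lorder before a database is created.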
size_t db_pagesize;
The size of the pages used to hold items in the
database, in bytes. The minimum page size is 512
bytes and the maximum page size is 64K bytes. If
db_pagesize is 0, a page size is selected based on
the underlying filesystem I/O block size. The
selected size has a lower limit of 512 bytes and an
upper limit of 16K bytes.
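The limits above can be summarized in a few lines of C. The sketch below is an illustrative model only: both function names are hypothetical, and the assumption that the default is a plain clamp of the filesystem I/O block size is mine, since the actual selection logic is not described here.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_PAGESIZE	512		/* documented minimum */
#define MAX_PAGESIZE	(64 * 1024)	/* documented maximum */
#define MAX_DEFAULT	(16 * 1024)	/* documented cap on the default */

/* Hypothetical check: is an explicit db_pagesize within the limits? */
static int
pagesize_in_range(size_t pagesize)
{
	return (pagesize >= MIN_PAGESIZE && pagesize <= MAX_PAGESIZE);
}

/*
 * Hypothetical model of the default: clamp the filesystem I/O block
 * size to [512, 16K]. The real selection logic may differ.
 */
static size_t
default_pagesize(size_t io_block)
{
	assert(io_block > 0);
	if (io_block < MIN_PAGESIZE)
		return (MIN_PAGESIZE);
	if (io_block > MAX_DEFAULT)
		return (MAX_DEFAULT);
	return (io_block);
}
```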
void *(*db_malloc)(size_t);
The flag DB_DBT_MALLOC, when specified in the DBT
structure, will cause the DB library to allocate mem-
ory which then becomes the responsibility of the
calling application. See db_dbt(3) for more informa-
tion.
On systems where separate heaps are maintained for
applications and libraries (notably Windows NT),
specifying the DB_DBT_MALLOC flag will fail because
the DB library will allocate memory from a different
heap than the application will use to free it. To
avoid this problem, the db_malloc field should be set
to point to the application's allocation routine. If
db_malloc is non-NULL, it will be used to allocate
the memory returned when the DB_DBT_MALLOC flag is
set. The db_malloc function must match the calling
conventions of the malloc(3) library routine.
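A routine suitable for db_malloc only needs the same shape as malloc(3). The following sketch shows one possibility; app_malloc, app_alloc_calls, db_malloc_field and roundtrip are all hypothetical names used for illustration, and the function pointer variable merely stands in for the db_malloc field of DB_INFO.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter, used only to show that the routine was called. */
static unsigned long app_alloc_calls;

/*
 * Hypothetical application allocator with the same calling convention
 * as malloc(3): takes a size_t, returns void * (NULL on failure).
 */
static void *
app_malloc(size_t size)
{
	app_alloc_calls++;
	return (malloc(size));
}

/* Stand-in for the db_malloc field; app_malloc is assignable as is. */
static void *(*db_malloc_field)(size_t) = app_malloc;

/*
 * Allocate through the pointer, as a library would, then release the
 * memory with the application's own free(3) from the same heap.
 */
static int
roundtrip(size_t size)
{
	void *p;

	assert(db_malloc_field != NULL);
	if ((p = db_malloc_field(size)) == NULL)
		return (0);
	free(p);
	return (1);
}
```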
BTREE
The btree data structure is a sorted, balanced tree structure
storing associated key/data pairs. Searches, insertions,
and deletions in the btree all complete in O(log_base N)
time, where N is the number of keys in the tree and base is
the average number of keys per page. Often, inserting
ordered data into btrees results in pages that are
half-full. This implementation has been modified to make
ordered (or inverse ordered) insertion the best case,
resulting in nearly perfect page space utilization.
Space freed by deleting key/data pairs from the database
is never reclaimed from the filesystem, although it is
reused where possible. This means that the btree storage
structure is grow-only. If so many keys are deleted from
a tree that shrinking the underlying database file becomes
desirable, this can be accomplished by creating a new tree
from a scan of the existing one.
The following additional fields and flags may be initial-
ized in the DB_INFO structure before calling db_open, when
using the btree access method:
int (*bt_compare)(const DBT *, const DBT *);
The bt_compare field is the key comparison function. It
must return an integer less than, equal to, or greater
than zero if the first key argument is considered to
be respectively less than, equal to, or greater than
the second key argument. The same comparison function
must be used on a given tree every time it is
opened. If bt_compare is NULL, the keys are compared
lexically, with shorter keys collating before longer
keys.
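A comparison function following these rules might look like the sketch below. It is not the library's own code: the DBT typedef is a local stand-in reduced to a data pointer and a size (see db_dbt(3) for the real structure), and lexical_compare is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the DBT in <db.h>, reduced to the two fields used here. */
typedef struct {
	void	*data;	/* key bytes */
	uint32_t size;	/* key length in bytes */
} DBT;

/*
 * Sketch of a bt_compare-style function: byte-wise comparison, with the
 * shorter key collating first when one key is a prefix of the other.
 */
static int
lexical_compare(const DBT *a, const DBT *b)
{
	size_t len;
	int cmp;

	assert(a != NULL && b != NULL);
	len = a->size < b->size ? a->size : b->size;
	cmp = len == 0 ? 0 : memcmp(a->data, b->data, len);
	if (cmp != 0)
		return (cmp);
	/* The common prefix is identical: order by length. */
	return ((a->size > b->size) - (a->size < b->size));
}
```

Whatever function is chosen, it has to be supplied again on every later open of the same tree.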
int bt_minkey;
The minimum number of keys that will be stored on any
single page. This value is used to determine which
keys will be stored on overflow pages, i.e., if a key
or data item is larger than the page size divided by
the bt_minkey value, it will be stored on overflow pages
instead of in the page itself. The bt_minkey value
specified must be at least 2; if bt_minkey is 0, a
value of 2 is used.
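The overflow rule can be written as a one-line test. The sketch below models only the rule as stated; stored_on_overflow is a hypothetical name, and a real implementation presumably also accounts for per-page overhead, which is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the rule above: an item goes to overflow pages
 * when it is larger than pagesize / minkey. A bt_minkey of 0 means 2.
 */
static int
stored_on_overflow(size_t item_size, size_t pagesize, unsigned int minkey)
{
	if (minkey == 0)
		minkey = 2;
	assert(minkey >= 2 && pagesize > 0);
	return (item_size > pagesize / minkey);
}
```

With 4096-byte pages and the default bt_minkey, an item of 2048 bytes stays on the page while one of 2049 bytes does not.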
size_t (*bt_prefix)(const DBT *, const DBT *);
The bt_prefix field is the prefix comparison function. If
specified, this function must return the number of bytes
of the second key argument that are necessary to
determine that it is greater than the first key
argument. If the keys are equal, the key length should
be returned.
This is used to compress the keys stored on the btree
internal pages. The usefulness of this is data
dependent, but in some data sets can produce signifi-
cantly reduced tree sizes and search times. If
bt_prefix is NULL, and no comparison function is
specified, a default lexical comparison function is
used. If bt_prefix is NULL and a comparison function
is specified, no prefix comparison is done.
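One function that satisfies this contract for lexically ordered keys is sketched below. It is an illustration rather than the library's implementation: the DBT typedef is a local stand-in with only a data pointer and a size, and lexical_prefix is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the DBT in <db.h>, reduced to the two fields used here. */
typedef struct {
	void	*data;	/* key bytes */
	uint32_t size;	/* key length in bytes */
} DBT;

/*
 * Sketch of a bt_prefix-style function for lexical ordering: return how
 * many bytes of b are needed to tell that b is greater than a.
 */
static size_t
lexical_prefix(const DBT *a, const DBT *b)
{
	const unsigned char *p1, *p2;
	size_t cnt, len;

	assert(a != NULL && b != NULL);
	p1 = a->data;
	p2 = b->data;
	len = a->size < b->size ? a->size : b->size;
	/* Count bytes up to and including the first one that differs. */
	for (cnt = 1; cnt <= len; cnt++, p1++, p2++)
		if (*p1 != *p2)
			return (cnt);
	/* One key is a prefix of the other: one byte past the shorter. */
	if (a->size != b->size)
		return (len + 1);
	/* Equal keys: return the key length. */
	return (b->size);
}
```

A prefix function of this kind is only meaningful when it agrees with the ordering used by the comparison function.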
unsigned long flags;
The following additional flags may be specified by
or'ing together one or more of the following values:
DB_DUP
Permit duplicate keys in the tree, i.e., insertion
when the key of the key/data pair being inserted
already exists in the tree will be successful.
The ordering of duplicates in the tree is determined
by the order of insertion, unless the ordering is
otherwise specified by use of a cursor (see
db_cursor(3) for more information).
It is an error to specify both DB_DUP and
DB_RECNUM.
DB_RECNUM
Support retrieval from btrees using record num-
bers. For more information, see the DB_GETREC
flag to the db->get function (below), and the
cursor c_get function (in db_cursor(3)).
Logical record numbers in btrees are mutable in
the face of record insertion or deletion. See
the DB_RENUMBER flag in the RECNO section below
for further discussion.
Maintaining record counts within a btree intro-
duces a serious point of contention, namely the
page locations where the record counts are
stored. In addition, the entire tree must be
locked during both insertions and deletions,
effectively single-threading the tree for those
operations. Specifying DB_RECNUM can result in
serious performance degradation for some appli-
cations and data sets.
It is an error to specify both DB_DUP and
DB_RECNUM.
HASH
The hash data structure is an extensible, dynamic hashing
scheme. Backward compatible interfaces to the functions
described in dbm(3), ndbm(3) and hsearch(3) are provided;
however, these interfaces are not compatible with previous
file formats.
The following additional fields and flags may be initial-
ized in the DB_INFO structure before calling db_open, when
using the hash access method:
unsigned int h_ffactor;
The h_ffactor value indicates a desired density within the hash
table. It is an approximation of the number of keys
allowed to accumulate in any one bucket, determining
when the hash table grows or shrinks. The default
value is 0, indicating that the fill factor will be
selected dynamically as pages are filled.
u_int32_t (*h_hash)(const void *, u_int32_t);
The h_hash field is a user defined hash function; if
h_hash is NULL, a default hash function is used.
Since no hash function performs equally well on all
possible data, the user may find that the built-in
hash function performs poorly with a particular data
set. User specified hash functions must take a
pointer to a byte string and a length as arguments
and return a u_int32_t value.
If a hash function is specified, db_open will
attempt to determine if the hash function specified
is the same as the one with which the database was
created, and will fail if it detects that it is not.
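A user-specified hash function only has to take a byte string and a length and return a 32-bit value. The sketch below uses 32-bit FNV-1a purely as a well-known example; it is not the built-in DB hash function, fnv1a_hash is a hypothetical name, and the standard uint32_t type is used in place of u_int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of a function with the shape required by h_hash:
 * (pointer to bytes, length) -> 32-bit value. This is 32-bit FNV-1a;
 * uint32_t stands in for u_int32_t.
 */
static uint32_t
fnv1a_hash(const void *bytes, uint32_t length)
{
	const unsigned char *p = bytes;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	uint32_t i;

	assert(bytes != NULL || length == 0);
	for (i = 0; i < length; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return (h);
}
```

Because the function is checked against the one used at creation time, the same function has to be supplied on every later open of the database.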
unsigned int h_nelem;
An estimate of the final size of the hash table. If
not set or set too low, hash tables will expand
gracefully as keys are entered, although a slight
performance degradation may be noticed. The default
value is 1.
unsigned long flags;
The following additional flags may be specified by
or'ing together one or more of the following values:
DB_DUP
Permit duplicate keys in the database, i.e., insertion
when the key of the key/data pair being inserted
already exists in the database will be successful.
The ordering of duplicates in the database is determined
by the order of insertion, unless the ordering is
otherwise specified by use of a cursor (see
db_cursor(3) for more information).
RECNO
The recno access method provides support for fixed and
variable length records, optionally backed by a flat text
(byte stream) file. Both fixed and variable length
records are accessed by their logical record number.
It is valid to create a record whose record number is more
than one greater than the last record currently in the
database. For example, the creation of record number 8,
when records 6 and 7 do not yet exist, is not an error.
However, any attempt to retrieve such records (e.g.,
records 6 and 7) will return DB_KEYEMPTY.
Deleting a record will not, by default, renumber records
following the deleted record (see DB_RENUMBER below for
more information). Any attempt to retrieve deleted
records will return DB_KEYEMPTY.
The following additional fields and flags may be initial-
ized in the DB_INFO structure before calling db_open, when
using the recno access method:
int re_delim;
For variable length records, if the re_source file is
specified and the DB_DELIMITER flag is set, the
delimiting byte used to mark the end of a record in
the source file. If the re_source file is specified
and the DB_DELIMITER flag is not set, <newline> char-
acters (i.e. ``\n'', 0x0a) are interpreted as end-of-
record markers.
u_int32_t re_len;
The length of a fixed-length record.
int re_pad;
For fixed length records, if the DB_PAD flag is set,
the pad character for short records. If the DB_PAD
flag is not set, <space> characters (i.e., 0x20) are
used for padding.
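The effect of re_len and re_pad on a fixed-length record can be modeled as follows. This is a hypothetical sketch of the behavior, not DB code: pad_record is an invented name, and the rejection of records longer than re_len follows the DB_FIXEDLEN description in this manual page.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical model of fixed-length record handling: copy `len` bytes
 * into a buffer of `re_len` bytes and fill the remainder with `re_pad`.
 * Returns 0 on success, -1 if the record is longer than re_len.
 */
static int
pad_record(char *out, size_t re_len, const char *rec, size_t len, int re_pad)
{
	assert(out != NULL && rec != NULL);
	if (len > re_len)
		return (-1);	/* too long: rejected, never truncated */
	memcpy(out, rec, len);
	memset(out + len, re_pad, re_len - len);
	return (0);
}
```

With re_len set to 8 and the default pad character, the 3-byte record "abc" would be stored as "abc" followed by five <space> characters.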
char *re_source;
The purpose of the re_source field is to provide fast
access and modification to databases that are nor-
mally stored as flat text files.
If the re_source field is non-NULL, it specifies an
underlying flat text database file that is read to
initialize a transient record number index. In the
case of variable length records, the records are sep-
arated by the byte value re_delim. For example,
standard UNIX byte stream files can be interpreted as
a sequence of variable length records separated by
<newline> characters.
In addition, when cached data would normally be writ-
ten back to the underlying database file (e.g., the
close or sync functions are called), the in-memory
copy of the database will be written back to the
re_source file.
By default, the backing source file is read lazily,
i.e., records are not read from the file until they
are requested by the application. If multiple pro-
cesses (not threads) are accessing a recno database
concurrently and either inserting or deleting
records, the backing source file must be read in its
entirety before more than a single process accesses
the database, and only that process should specify
the backing source file as part of the db_open call.
See the DB_SNAPSHOT flag below for more information.
Reading and writing the backing source file specified
by re_source cannot be transactionally protected
because it involves filesystem operations that are
not part of the DB transaction methodology. For this
reason, if a temporary database is used to hold the
records, i.e., a NULL was specified as the file argu-
ment to db_open, it is possible to lose the contents
of the re_source file, e.g., if the system crashes at
the right instant. If a file is used to hold the
database, i.e., a file name was specified as the file
argument to db_open, normal database recovery on that
file can be used to prevent information loss,
although it is still possible that the contents of
re_source will be lost if the system crashes.
The re_source file must already exist (but may be
zero-length) when db_open is called.
For all of the above reasons, the re_source field is
generally used to specify databases that are read-
only for DB applications, and that are either gener-
ated on the fly by software tools, or modified using
a different mechanism, e.g., a text editor.
unsigned long flags;
The following additional flags may be specified by
or'ing together one or more of the following values:
DB_DELIMITER
The re_delim field is set.
DB_FIXEDLEN
The records are fixed-length, not byte delim-
ited. The structure element re_len specifies
the length of the record, and the structure ele-
ment re_pad is used as the pad character.
Any records added to the database that are less
than re_len bytes long are automatically padded.
Any attempt to insert records into the database
that are greater than re_len bytes long will
cause the call to fail immediately and return an
error.
DB_PAD
The re_pad field is set.
DB_RENUMBER
Specifying the DB_RENUMBER flag causes the logi-
cal record numbers to be mutable, and change as
records are added to and deleted from the
database. For example, the deletion of record
number 4 causes records numbered 5 and greater
to be renumbered downward by 1. If a cursor was
positioned to record number 4 before the dele-
tion, it will reference the new record number 4,
if any such record exists, after the deletion.
If a cursor was positioned after record number 4
before the deletion, it will be shifted downward
1 logical record, continuing to reference the
same record as it did before.
Using the c_put or put interfaces to create new
records will cause the creation of multiple
records if the record number is more than one
greater than the largest record currently in the
database. For example, creating record 28, when
record 25 was previously the last record in the
database, will create records 26 and 27 as well
as 28. Attempts to retrieve records that were
created in this manner will result in an error
return of DB_KEYEMPTY.
If a created record is not at the end of the
database, all records following the new record
will be automatically renumbered upward by 1.
For example, the creation of a new record num-
bered 8 causes records numbered 8 and greater to
be renumbered upward by 1. If a cursor was
positioned to record number 8 or greater before
the insertion, it will be shifted upward 1 logi-
cal record, continuing to reference the same
record as it did before.
For these reasons, concurrent access to a recno
database with the DB_RENUMBER flag specified may
be largely meaningless, although it is sup-
ported.
DB_SNAPSHOT
This flag specifies that any specified re_source
file be read in its entirety when db_open is
called. If this flag is not specified, the
re_source file may be read lazily.
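The cursor behavior described for DB_RENUMBER can be summarized by two small rules. The sketch below is a model of those rules only: recno_t is a local stand-in for db_recno_t, both function names are hypothetical, record numbers are assumed to start at 1, and the insertion rule applies when the created record is not at the end of the database.

```c
#include <assert.h>

typedef unsigned long recno_t;	/* stand-in for db_recno_t */

/*
 * Record number a cursor references after record `deleted` is removed.
 * A cursor on the deleted record keeps its number; later ones shift down.
 */
static recno_t
cursor_after_delete(recno_t cursor, recno_t deleted)
{
	assert(cursor >= 1 && deleted >= 1);
	return (cursor > deleted ? cursor - 1 : cursor);
}

/*
 * Record number a cursor references after a new record `inserted` is
 * created in front of existing records: cursors at or past it shift up.
 */
static recno_t
cursor_after_insert(recno_t cursor, recno_t inserted)
{
	assert(cursor >= 1 && inserted >= 1);
	return (cursor >= inserted ? cursor + 1 : cursor);
}
```

For example, after record 4 is deleted, a cursor that was on record 7 references record 6, which holds the same data as before.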
DB OPERATIONS
The DB structure returned by db_open describes a database
type, and includes a set of functions to perform various
actions, as described below. Each of these functions
takes a pointer to a DB structure, and may take one or
more DBT *'s and a flag value as well. The fields of the
DB structure are as follows:
DBTYPE type;
The type of the underlying access method (and file
format). Set to one of DB_BTREE, DB_HASH or
DB_RECNO. This field may be used to determine the
type of the database after a return from db_open with
the type argument set to DB_UNKNOWN.
int (*close)(DB *db, int flags);
A pointer to a function to flush any cached
information to disk, close any open cursors (see
db_cursor(3)), free any allocated resources, and
close any underlying files. Since key/data pairs are
cached in memory, failing to sync the file with the
close or sync function may result in inconsistent or
lost information.
The flags parameter must be set to 0 or the following
value:
DB_NOSYNC
Do not flush cached information to disk.
The DB_NOSYNC flag is a dangerous option. It should
only be set if the application is doing logging (with
or without transactions) so that the database is
recoverable after a system or application crash, or
if the database is always generated from scratch
after any system or application crash.
It is important to understand that flushing cached
information to disk only minimizes the window of
opportunity for corrupted data. While unlikely, it
is possible for database corruption to happen if a
system or application crash occurs while writing data
to the database. To ensure that database corruption
never occurs, applications must either: use transac-
tions and logging with automatic recovery, use log-
ging and application-specific recovery, or edit a
copy of the database, and, once all applications
using the database have successfully called close,
replace the original database with the updated copy.
When multiple threads are using the DB handle concur-
rently, only a single thread may call the DB handle
close function.
The close function returns the value of errno on
failure and 0 on success.
int (*cursor)(DB *db, DB_TXN *txnid, DBC **cursorp);
A pointer to a function to create a cursor and copy a
pointer to it into the memory referenced by cursorp.
A cursor is a structure used to provide sequential
access through a database. This interface and its
associated functions replace the functionality provided
by the seq function in previous releases of the
DB library.
If the file is being accessed under transaction protection,
the txnid parameter is a transaction ID returned from
txn_begin; otherwise, it is NULL. If
transaction protection is enabled, cursors must be
opened and closed within the context of a transac-
tion, and the txnid parameter specifies the transac-
tion context in which the cursor may be used. See
db_cursor(3) for more information.
The cursor function returns the value of errno on
failure and 0 on success.
int (*del)(DB *db, DB_TXN *txnid, DBT *key, int flags);
A pointer to a function to remove key/data pairs from
the database. The key/data pair associated with the
specified key is discarded from the database. In the
presence of duplicate key values, all records associ-
ated with the designated key will be discarded.
If the file is being accessed under transaction protection,
the txnid parameter is a transaction ID returned from
txn_begin; otherwise, it is NULL.
The flags parameter is currently unused, and must be
set to 0.
The del function returns the value of errno on fail-
ure, 0 on success, and DB_NOTFOUND if the specified
key did not exist in the file.
int (*fd)(DB *db, int *fdp);
A pointer to a function that copies a file descriptor
representative of the underlying database into the
memory referenced by fdp. A file descriptor refer-
encing the same file will be returned to all pro-
cesses that call db_open with the same file argument.
This file descriptor may be safely used as an argu-
ment to the fcntl(2) and flock(2) locking functions.
The file descriptor is not necessarily associated
with any of the underlying files used by the access
method.
The fd function was introduced in early versions of
DB, before the lock manager was added, to support a
coarse-grained form of locking. Applications should
be converted to use the lock manager where possible,
and this interface should not be used by new applica-
tions.
The fd function returns the value of errno on failure
and 0 on success.
int (*get)(DB *db, DB_TXN *txnid,
DBT *key, DBT *data, int flags);
A pointer to a function that is an interface for
keyed retrieval from the database. The address and
length of the data associated with the specified key
are returned in the structure referenced by data.
In the presence of duplicate key values, get will
return the first data item for the designated key.
Duplicates are sorted by insert order except where
this order has been overridden by cursor operations.
Retrieval of duplicates requires the use of cursor
operations. See db_cursor(3) for details.
If the file is being accessed under transaction protection,
the txnid parameter is a transaction ID returned from
txn_begin; otherwise, it is NULL.
The flags parameter must be set to 0 or the following
value:
DB_GETREC
Retrieve a specific numbered record from a
database. Upon return, both the key and data
items will have been filled in, not just the
data item as is done for all other uses of the
get function.
For DB_GETREC to be specified, the underlying
database must be of type btree, and it must have
been created with the DB_RECNUM flag (see the BTREE
section above). In this case, the data field of
the key must be a pointer to a memory location
of type db_recno_t, as described in db_dbt(3).
If the database is a recno database and the requested
key exists, but was never explicitly created by the
application or was later deleted, the get function
returns DB_KEYEMPTY. Otherwise, if the requested key
isn't in the database, the get function returns
DB_NOTFOUND. Otherwise, the get function returns the
value of errno on failure and 0 on success.
int (*put)(DB *db, DB_TXN *txnid,
DBT *key, DBT *data, int flags);
A pointer to a function to store key/data pairs in
the database. If the database supports duplicates,
the put function adds the new data value at the end
of the duplicate set.
If the file is being accessed under transaction protection,
the txnid parameter is a transaction ID returned from
txn_begin; otherwise, it is NULL.
The flags value is specified by or'ing together one
or more of the following values:
DB_APPEND
Append the key/data pair to the end of the
database. For DB_APPEND to be specified, the
underlying database must be of type recno. The
record number allocated to the record is
returned in the specified key.
DB_NOOVERWRITE
Enter the new key/data pair only if the key does
not already appear in the database.
The default behavior of the put function is to enter
the new key/data pair, replacing any previously
existing key if duplicates are disallowed, or to add
a duplicate entry if duplicates are allowed. Even if
the designated database allows duplicates, a call to
put with the DB_NOOVERWRITE flag set will fail if the
key already exists in the database.
The put function returns the value of errno on fail-
ure, 0 on success, and DB_KEYEXIST if the DB_NOOVER-
WRITE flag was set and the key already exists in the
file.
int (*sync)(DB *db, int flags);
A pointer to a function to flush any cached informa-
tion to disk. If the database is in memory only, the
sync function has no effect and will always succeed.
The flags parameter is currently unused, and must be
set to 0.
See the close function description above for a dis-
cussion of DB and cached data.
The sync function returns the value of errno on fail-
ure and 0 on success.
int (*stat)(DB *db, void *sp,
void *(*db_malloc)(size_t), int flags);
A pointer to a function to create a statistical
structure and copy a pointer to it into a user-specified
memory location. Specifically, if sp is non-NULL,
a pointer to the statistics for the database
is copied into the memory location it references.
Statistical structures are created in allocated memory.
If db_malloc is non-NULL, it is called to allocate
the memory; otherwise, the library function malloc(3)
is used. The function db_malloc must match the calling
conventions of the malloc(3) library routine.
Regardless, the caller is responsible for
deallocating the returned memory. To deallocate the
returned memory, free each returned memory pointer;
pointers inside the memory do not need to be individ-
ually freed.
In the presence of multiple threads or processes
accessing an active database, the returned informa-
tion may be out-of-date.
This function may access all of the pages in the
database, and therefore may incur a severe perfor-
mance penalty and have obvious negative effects on
the underlying buffer pool.
The flags parameter must be set to 0 or the following
value:
DB_RECORDCOUNT
In the case of a btree or recno database, fill
in the bt_nrecs field, but do not collect any
other information. This flag makes it reason-
able for applications to request a record count
from a database without incurring a performance
penalty.
The stat function returns the value of errno on fail-
ure and 0 on success.
In the case of a btree or recno database, the statis-
tics are stored in a structure of type DB_BTREE_STAT
(typedef'd in <db.h>). The following fields will be
filled in:
u_int32_t bt_flags;
Permanent database flags, including DB_DUP,
DB_FIXEDLEN, DB_RECNUM and DB_RENUMBER.
u_int32_t bt_minkey;
The bt_minkey value specified to db_open(3), if
any.
u_int32_t bt_re_len;
The re_len value specified to db_open(3), if
any.
u_int32_t bt_re_pad;
The re_pad value specified to db_open(3), if
any.
u_int32_t bt_pagesize;
Underlying tree page size.
u_int32_t bt_levels;
Number of levels in the tree.
u_int32_t bt_nrecs;
Number of data items in the tree (since there
may be multiple data items per key, this number
may not be the same as the number of keys).
u_int32_t bt_int_pg;
Number of tree internal pages.
u_int32_t bt_leaf_pg;
Number of tree leaf pages.
u_int32_t bt_dup_pg;
Number of tree duplicate pages.
u_int32_t bt_over_pg;
Number of tree overflow pages.
u_int32_t bt_free;
Number of pages on the free list.
u_int32_t bt_freed;
Number of pages made available for reuse because
they were emptied.
u_int32_t bt_int_pgfree;
Number of bytes free in tree internal pages.
u_int32_t bt_leaf_pgfree;
Number of bytes free in tree leaf pages.
u_int32_t bt_dup_pgfree;
Number of bytes free in tree duplicate pages.
u_int32_t bt_over_pgfree;
Number of bytes free in tree overflow pages.
u_int32_t bt_pfxsaved;
Number of bytes saved by prefix compression.
u_int32_t bt_split;
Total number of tree page splits (includes fast
and root splits).
u_int32_t bt_rootsplit;
Number of root page splits.
u_int32_t bt_fastsplit;
Number of fast splits. When sorted keys are
added to the database, the DB btree implementa-
tion will split left or right to increase the
page-fill factor. This number is a measure of
how often it was possible to make such a split.
u_int32_t bt_added;
Number of keys added.
u_int32_t bt_deleted;
Number of keys deleted.
u_int32_t bt_get;
Number of keys retrieved. (Note, this value
will not reflect any keys retrieved when the
database was open for read-only access, as there
is no permanent location to store the informa-
tion in this case.)
u_int32_t bt_cache_hit;
Number of hits in tree fast-insert code. When
sorted keys are added to the database, the DB
btree implementation will check the last page
where an insert occurred before doing a full
lookup. This number is a measure of how often
the lookup was successful.
u_int32_t bt_cache_miss;
Number of misses in tree fast-insert code. See
the description of bt_cache_hit; this number is
a measure of how often the lookup failed.
ENVIRONMENT VARIABLES
The following environment variables affect the execution
of db_open:
DB_HOME
If the dbenv argument to db_open was initialized
using db_appinit, the environment variable DB_HOME
may be used as the path of the database home for the
interpretation of the file argument to db_open, as
described in db_appinit(3). Specifically, db_open is
affected by the configuration string value of
DB_DATA_DIR.
COMPILING
On IRIX, if you are compiling a threaded application, you
must compile with the -D_SGI_MP_SOURCE flag:
cc -D_SGI_MP_SOURCE ...
On OSF/1, if you are compiling a threaded application, you
must compile with the -D_REENTRANT flag:
cc -D_REENTRANT ...
On Solaris, if you are compiling a threaded application,
you must compile with the -D_REENTRANT flag and link with
the -lthread library:
cc -D_REENTRANT ... -lthread
EXAMPLES
Applications that create short-lived databases, which are
discarded or recreated when the system fails, and that are
unconcerned with concurrent access and loss of data due to
catastrophic failure may wish to use the db_open
functionality without other parts of the DB library. Such
applications will only be concerned with the DB access
methods. The DB access methods will use the memory pool
subsystem, but the application is unlikely to be aware of
this. See the files example/ex_access.c and exam-
ple/ex_btrec.c in the DB source distribution for C lan-
guage code examples of how such applications might use the
DB library.
ERRORS
The db_open function may fail and return errno for any of
the errors specified for the following DB and library
functions: close(2), fcntl(2), fstat(2), getpid(2),
mmap(2), munmap(2), open(2), read(2), unlink(2), abort(3),
calloc(3), db->sync, fflush(3), free(3), getenv(3),
isdigit(3), lock_get(3), lock_id(3), lock_put(3),
lock_vec(3), log_register(3), log_unregister(3), mal-
loc(3), memcpy(3), memp_close(3), memp_fclose(3),
memp_fget(3), memp_fopen(3), memp_fput(3), memp_fset(3),
memp_fsync(3), memp_open(3), memp_register(3), memset(3),
sigfillset(3), sigprocmask(3), stat(3), strcpy(3),
strdup(3), strerror(3), strlen(3), t->re_irec and
vsnprintf(3).
In addition, the db_open function may fail and return
errno for the following conditions:
[EAGAIN]
A lock was unavailable.
[EINVAL]
An invalid flag value or parameter was specified
(e.g., unknown database type, page size, hash func-
tion, recno pad byte, byte order) or a flag value or
parameter that is incompatible with the current file
specification.
TMPDIR If the dbenv argument to db_open was NULL or not
initialized using db_appinit, the environment variable
TMPDIR may be used as the directory in which to
create temporary files, as described in the db_open
section above.
There is a mismatch between the version number of
file and the software.
A re_source file was specified with either the
DB_THREAD flag or a non-NULL tx_info field in the
DB_ENV argument to db_open.
[ENOENT]
A non-existent re_source file was specified.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->close function may fail and return errno for any
of the errors specified for the following DB and library
functions: close(2), fcntl(2), getpid(2), munmap(2),
open(2), unlink(2), abort(3), db->db_malloc, db->sync,
fflush(3), fprintf(3), free(3), getenv(3), isdigit(3),
lock_get(3), lock_put(3), lock_vec(3), log_put(3), mal-
loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
memp_fset(3), memset(3), realloc(3), sigfillset(3), sig-
procmask(3), snprintf(3), stat(3), strcpy(3), strdup(3),
strerror(3), strlen(3) and vsnprintf(3).
The db->cursor function may fail and return errno for any
of the errors specified for the following DB and library
functions: free(3).
In addition, the db->cursor function may fail and return
errno for the following conditions:
[EINVAL]
An invalid flag value or parameter was specified.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->del function may fail and return errno for any of
the errors specified for the following DB and library
functions: db->db_malloc, fflush(3), fprintf(3), free(3),
lock_get(3), lock_put(3), lock_vec(3), log_put(3), mal-
loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
memp_fset(3), memset(3), realloc(3) and vsnprintf(3).
In addition, the db->del function may fail and return
errno for the following conditions:
[EAGAIN]
A lock was unavailable.
[EINVAL]
An invalid flag value or parameter was specified.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->fd function may fail and return errno for the
following conditions:
[ENOENT]
The db->fd function was called for an in-memory
database, or no underlying file has yet been created.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->get function may fail and return errno for any of
the errors specified for the following DB and library
functions: db->db_malloc, fflush(3), fprintf(3),
lock_get(3), lock_put(3), lock_vec(3), malloc(3), mem-
cpy(3), memp_fget(3), memp_fput(3), realloc(3) and
vsnprintf(3).
In addition, the db->get function may fail and return
errno for the following conditions:
[EAGAIN]
A lock was unavailable.
[EINVAL]
An invalid flag value or parameter was specified.
The DB_THREAD flag was specified to the db_open(3)
function and neither the DB_DBT_MALLOC nor the
DB_DBT_USERMEM flag was set in the DBT.
A record number of 0 was specified.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->put function may fail and return errno for any of
the errors specified for the following DB and library
functions: db->db_malloc, fflush(3), fprintf(3), free(3),
lock_get(3), lock_put(3), lock_vec(3), log_put(3), mal-
loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
memp_fset(3), memset(3), realloc(3), t->bt_prefix and
vsnprintf(3).
In addition, the db->put function may fail and return
errno for the following conditions:
[EACCES]
An attempt was made to modify a read-only database.
[EAGAIN]
A lock was unavailable.
[EINVAL]
An invalid flag value or parameter was specified.
A record number of 0 was specified.
An attempt was made to add a record to a fixed-length
database, but the record was too large to fit.
An attempt was made to do a partial put.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
[ENOSPC]
A btree exceeded the maximum btree depth (255).
The db->sync function may fail and return errno for any of
the errors specified for the following DB and library
functions: close(2), fcntl(2), open(2), write(2),
abort(3), db->db_malloc, fflush(3), fprintf(3), free(3),
lock_get(3), lock_put(3), lock_vec(3), log_put(3), mal-
loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
memp_fset(3), memp_fsync(3), memset(3), realloc(3), str-
error(3), t->bt_prefix, t->re_irec and vsnprintf(3).
In addition, the db->sync function may fail and return
errno for the following conditions:
[EINVAL]
An invalid flag value or parameter was specified.
[EPERM]
Database corruption was detected. All subsequent
database calls (other than db->close) will return
EPERM.
The db->stat function may fail and return errno for any of
the errors specified for the following DB and library
functions: malloc(3).
SEE ALSO
The Ubiquitous B-tree, Douglas Comer, ACM Comput. Surv.
11, 2 (June 1979), 121-138.
Prefix B-trees, Bayer and Unterauer, ACM Transactions on
Database Systems, Vol. 2, 1 (March 1977), 11-26.
The Art of Computer Programming Vol. 3: Sorting and
Searching, D.E. Knuth, 1968, pp 471-480.
Dynamic Hash Tables, Per-Ake Larson, Communications of the
ACM, April 1988.
A New Hash Package for UNIX, Margo Seltzer, USENIX Pro-
ceedings, Winter 1991.
Document Processing in a Relational Database System,
Michael Stonebraker, Heidi Stettner, Joseph Kalash,
Antonin Guttman, Nadene Lynn, Memorandum No. UCB/ERL
M82/32, May 1982.
db_archive(1), db_checkpoint(1), db_deadlock(1), db_dump(1),
db_intro(3), db_load(1), db_recover(1), db_stat(1),
db_appinit(3), db_cursor(3), db_dbm(3), db_lock(3), db_log(3),
db_mpool(3), db_open(3), db_txn(3)