NAME

pgexport - a Perl utility to download ASCII files from PostgreSQL databases


VERSION

This version is 0.9.1, released 2004-02-12

Enhancements in this version

option for XML export for Kugar reporting engine

Bugfixes in this version

fixed overwriting of randomly generated filename

previous versions:

0.9.0 (2004-01-30), initial release


SYNOPSIS

command-line method:

pgexport mode dbhost dbport dbname dbuser dbpass tablename datafile mask_or_delimiter firstrecord specfile fieldlist forcecase quotemark apostrophes

where mode is d or f

or

command-line method, XML output:

pgexport mode dbhost dbport dbname dbuser dbpass tablename datafile implicit_level default_level template temporary forcecase

mode is always k

or

configuration file method:

pgexport c configfile

All arguments are mandatory, but some may be marked as NULL.


DESCRIPTION

pgexport allows for the export of delimited or fixed-width ASCII text from PostgreSQL databases, and for XML output for use in the Kugar reporting engine.

I wrote this because (1) COPY is limited to the superuser; (2) \copy is a crippled version of COPY; (3) pgadmin3 and pgaccess have broken import/export functionality (at least on Fedora Core 1, and yes, with the updates installed via yum); and (4) I am trying to extricate myself from certain proprietary software products from Redmond, Washington. A replacement for a certain desktop database is the last piece of the puzzle, and I am only too happy to share with others.

So, in the great tradition of Open-Source, I rolled my own.

Yes, there is a pgimport.


LICENSE

This software (pgexport) may be used under either the GNU General Public License, version 2 (or at your option, any later version), or the Artistic License.

No warranty or guarantee, either express or implied, exists for this software or for the use of this software. You use this software at your own risk.


DEVELOPMENT

pgexport was developed using Fedora Core 1 (updated via yum to 2004-01-26), PostgreSQL 7.3.4, and Perl 5.8.1 i386-linux-thread-multi.


OPTIONS

command-line method

NOTE: When NULL is an option, the literal string ``NULL'' is meant, without any quotation marks. A ``0'' (zero) can also be substituted where NULL is used (again, no quotation marks).

mode
mode is either d if exporting delimited data, or f if exporting fixed-width data

dbhost
dbhost is the DNS host name or IP address of the PostgreSQL server. If NULL, then localhost is the default.

dbport
dbport is the connection port for the PostgreSQL server. If NULL, then 5432 is the default.

dbname
dbname is the name of the database on the server. No default value is available, so you MUST provide this information.
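
The three connection arguments above map naturally onto a DBD::Pg data source string. The following is a minimal sketch of how the NULL defaults might be applied, not a copy of the pgexport source; the helper names is_null and build_dsn are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: treat the literal string NULL (or 0) as "not supplied".
sub is_null {
    my ($value) = @_;
    return 1 if !defined $value;
    return ($value eq 'NULL' || $value eq '0') ? 1 : 0;
}

# Hypothetical helper: build a DBD::Pg data source name, applying the
# documented defaults (localhost, 5432) when NULL is passed.
sub build_dsn {
    my ($dbhost, $dbport, $dbname) = @_;
    die "dbname is required\n" if is_null($dbname);
    $dbhost = 'localhost' if is_null($dbhost);
    $dbport = 5432        if is_null($dbport);
    return "dbi:Pg:dbname=$dbname;host=$dbhost;port=$dbport";
}

# "sales" is a made-up database name.
print build_dsn('NULL', 'NULL', 'sales'), "\n";
# prints: dbi:Pg:dbname=sales;host=localhost;port=5432
```

A string of this form is what DBI->connect() expects as its first argument, followed by the username and password.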

dbuser
dbuser is the username by which you intend to connect to the database server. No default, but if NULL is passed, you will be prompted for a username from the command line.

dbpass
dbpass is the password by which you intend to connect to the database server. No default, but if NULL is passed, you will be prompted for a password from the command line. If Perl module Term::ReadKey is available, the password will be entered no-echo. If the module is not available, the password will be entered in cleartext.
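
The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the pgexport source: the helper names have_readkey and prompt_password are hypothetical, and only the documented Term::ReadKey functions ReadMode and ReadLine are used.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether Term::ReadKey can be loaded.
sub have_readkey {
    return eval { require Term::ReadKey; 1 } ? 1 : 0;
}

# Hypothetical helper: prompt for a password, without echo when possible.
sub prompt_password {
    my ($prompt) = @_;
    print STDERR $prompt;
    my $password;
    if (have_readkey()) {
        Term::ReadKey::ReadMode('noecho');       # turn terminal echo off
        $password = Term::ReadKey::ReadLine(0);  # ordinary blocking line read
        Term::ReadKey::ReadMode('restore');      # put the terminal back
        print STDERR "\n";
    }
    else {
        $password = <STDIN>;                     # cleartext fallback
    }
    chomp $password if defined $password;
    return $password;
}

# Usage (interactive, so left commented out):
# my $dbpass = prompt_password('Password: ');
```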

tablename
tablename is the name of the table or view in the database from which you want to export your data. Note that the table or view must already exist. No default value is possible.

datafile
datafile is the name of the file to which you want to export the data. An absolute path is not required, but is best. If NULL is passed, the default is a timestamped filename of the form tablename_export_YYYYMMDD_HHMMSS, which will be saved in the same directory as pgexport. If you do not have write permissions to the directory, you will not get an export file!
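
A default name of the form described above could be produced along these lines. This is a sketch only; the helper name default_datafile is hypothetical, and the use of local time for the timestamp is an assumption.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: build tablename_export_YYYYMMDD_HHMMSS.
# Local time is an assumption; the documentation does not say which clock is used.
sub default_datafile {
    my ($tablename, $epoch) = @_;
    $epoch = time unless defined $epoch;
    my $stamp = strftime('%Y%m%d_%H%M%S', localtime($epoch));
    return "${tablename}_export_$stamp";
}

# "customers" is a made-up table name.
print default_datafile('customers'), "\n";
```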

mask_or_delimiter
mask_or_delimiter works differently depending on whether mode is d or f.

If mode is d, then the field delimiter is entered here (use \t for tabs). There is NO default delimiter.

If mode is f, then a pack mask may be given here, or NULL passed. (See Perl documentation for functions pack() and unpack()). Pack masks should use only the A template character for ASCII data (i.e. do not use a template character other than A unless you know what you are doing).
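
As a brief illustration of what a pack mask does, the sketch below formats one record with Perl's built-in pack() and reads it back with unpack(). The helper name format_fixed and the sample values are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: format one fixed-width record from a pack mask.
# The A template pads short values with spaces and truncates long ones.
sub format_fixed {
    my ($mask, @values) = @_;
    return pack($mask, @values);
}

my $mask   = 'A10 A5 A8';    # three fields: 10, 5 and 8 characters wide
my $record = format_fixed($mask, 'Smith', 'NY', '20040212');

print '[', $record, "]\n";   # prints: [Smith     NY   20040212]
print length($record), "\n"; # prints: 23

# unpack() with the same mask splits the record again and drops the padding.
my @fields = unpack($mask, $record);
print join('|', @fields), "\n";   # prints: Smith|NY|20040212
```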

firstrecord
If mode is f, this must be NULL.

If mode is d, pass a 1 here to indicate that the datafile should be exported with field names in its first record. Otherwise, pass NULL.

specfile
specfile indicates the location of a layout and specification file. The file should contain records in the form recordname,fieldwidth (yes, comma-separated), with each record entry terminated by a newline (so it should ``read down''), although fieldwidth is optional if mode is d.

If no layout and specification file is used, pass NULL.
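
To make the file layout concrete, here is a sketch that parses specfile text of the form described above and derives a pack mask from the widths. The helper names parse_spec and mask_from_widths and the sample field names are hypothetical, and deriving the mask from the widths is an assumption about how such a file could be used, not a statement about the pgexport source.

```perl
use strict;
use warnings;

# Hypothetical helper: parse specfile text into field names and widths.
sub parse_spec {
    my ($text) = @_;
    my (@names, @widths);
    for my $line (split /\n/, $text) {
        $line =~ s/\s+\z//;                  # drop trailing whitespace
        next if $line eq '';                 # skip blank lines
        my ($name, $width) = split /,/, $line, 2;
        push @names,  $name;
        push @widths, $width;                # undef when the width is omitted (mode d)
    }
    return (\@names, \@widths);
}

# Hypothetical helper: turn widths such as (20, 2, 10) into "A20 A2 A10".
sub mask_from_widths {
    my ($widths) = @_;
    return join ' ', map { "A$_" } @$widths;
}

my ($names, $widths) = parse_spec("lastname,20\nstate,2\nzip,10\n");
print join(',', @$names), "\n";        # prints: lastname,state,zip
print mask_from_widths($widths), "\n"; # prints: A20 A2 A10
```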

fieldlist
fieldlist contains a comma-delimited list of field names. NULL may be passed if you are providing field names another way.

forcecase
forcecase can have one of three values: u, l, p.

p preserves the case of all values and fieldnames. This is the default.

u forces all values and fieldnames to uppercase.

l forces all values and fieldnames to lowercase.

quotemark
quotemark contains the quoting character you want to use. All data is exported as text. If you do not want a quoting character, pass NULL.

quotemark should be NULL if mode is f.

If you are using pgexport in command-line mode and you want to use the single-quote or double-quote character, you must escape the character with a backslash. However, if you are using configuration file mode, do NOT escape the character with a backslash.

apostrophes
apostrophes controls the export of single-quote and apostrophe characters.

If apostrophes is 1, single-quotes and apostrophes will be preserved. This is the default.

If apostrophes is 0, single-quotes and apostrophes are eliminated from values before export, and are therefore lost.

If you want to use the single-quote character in quotemark, then take care that apostrophes is 0.
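
The interaction between quotemark and apostrophes can be sketched as follows. The helper name format_record and the sample values are hypothetical, and writing undefined values as empty strings is an assumption; the documentation does not say how database NULLs are exported.

```perl
use strict;
use warnings;

# Hypothetical helper: format one delimited record.
# $quotemark is undef when no quoting character is wanted.
sub format_record {
    my ($values, $delimiter, $quotemark, $apostrophes) = @_;
    my @out;
    for my $value (@$values) {
        my $text = defined $value ? "$value" : '';   # assumption: undef becomes empty
        $text =~ s/'//g unless $apostrophes;         # apostrophes = 0 strips single quotes
        $text = $quotemark . $text . $quotemark if defined $quotemark;
        push @out, $text;
    }
    return join $delimiter, @out;
}

# Double-quote as quotemark, apostrophes preserved, tab-delimited.
print format_record(["O'Brien", 42], "\t", '"', 1), "\n";

# Single-quote as quotemark, so apostrophes must be 0 to keep the output unambiguous.
print format_record(["O'Brien", 42], ',', "'", 0), "\n";   # prints: 'OBrien','42'
```

Note that every field is quoted, including the numeric one, because all data is exported as text.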

command-line method, XML output

mode
mode is always k for XML output.

dbhost
dbhost is the DNS host name or IP address of the PostgreSQL server. If NULL, then localhost is the default.

dbport
dbport is the connection port for the PostgreSQL server. If NULL, then 5432 is the default.

dbname
dbname is the name of the database on the server. No default value is available, so you MUST provide this information.

dbuser
dbuser is the username by which you intend to connect to the database server. No default, but if NULL is passed, you will be prompted for a username from the command line.

dbpass
dbpass is the password by which you intend to connect to the database server. No default, but if NULL is passed, you will be prompted for a password from the command line. If Perl module Term::ReadKey is available, the password will be entered no-echo. If the module is not available, the password will be entered in cleartext.

tablename
tablename is the name of the table or view in the database from which you want to export your data. Note that the table or view must already exist. No default value is possible.

datafile
datafile is the name of the file to which you want to send the XML data. An absolute path is not required, but is best if not sending the file to /tmp (see entry for temporary). If NULL is passed, the default is a timestamped filename of the form tablename_xmlexport_YYYYMMDD_HHMMSS, which will be saved in your home directory ($HOME). If you do not have write permissions to the directory, you will not get an export file!

Passing a NULL here will cause pgexport to generate a filename.

implicit_level
Kugar allows unlimited detail levels in its reporting. However, you must supply the detail level with each record. You can either arrange to have it come from your table or view, or pgexport can do it for you. If you provide your own report levels, you must have a field called level. If you want pgexport to provide a default level for every record, your table or view must NOT contain a field called level.

If you want pgexport to provide a default level, enter a 1.

If you do NOT want pgexport to provide a default level, but will provide it yourself, enter a 0.

The default is 1 (pgexport provides a default level). Enter NULL to obtain the default.

default_level
If you want pgexport to provide a default level for the rows, enter a default level here. This value must be a nonzero integer.

The default value for default_level is 0. Enter NULL to obtain the default.

template
template is the name of the report template file used by Kugar to generate the report. Kugar template files usually end in .kut.

temporary
temporary indicates whether the XML file should be placed in /tmp. If a filename is provided in datafile, it must not include any path information, or the file will NOT go into /tmp.

Pass a 1 here to place the data file into /tmp. Pass a 0 here for a ``permanent'' file.

The default is 1.

forcecase
forcecase can have one of three values: u, l, p.

p preserves the case of all values and fieldnames. This is the default.

u forces all values and fieldnames to uppercase.

l forces all values and fieldnames to lowercase.

configuration file method

In the configuration file method, mode is c. Instead of reading options from the command line, they are read from configfile. The entries in configfile should appear in the same order, using the same syntax, as for the command-line method, except that a newline should terminate each argument (i.e. the arguments should form a list ``reading down'').
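
A configuration file of this shape could be read as sketched below. The helper name parse_config is hypothetical, and the example assumes that the first line of the file holds the real export mode (d, f or k) and that the lines fill the argument positions in command-line order; the database and table names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: read one argument per line from an open filehandle.
sub parse_config {
    my ($fh) = @_;
    my @config;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;   # strip only the line terminator; a delimiter may be whitespace
        push @config, $line;
    }
    return @config;
}

# Example file contents for a delimited export: comma delimiter, field names in
# the first record, double-quote as quotemark (not escaped in this mode).
my $example = join("\n",
    qw(d NULL NULL sales NULL NULL customers NULL),
    ',',
    qw(1 NULL NULL p),
    '"',
    '1',
) . "\n";

# An in-memory filehandle stands in for the real configfile here.
open my $fh, '<', \$example or die "Cannot open in-memory file: $!\n";
my @config = parse_config($fh);
close $fh;

print scalar(@config), " arguments, mode $config[0], table $config[6]\n";
# prints: 15 arguments, mode d, table customers
```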


INTERNALS

Array @config holds the information obtained from the command line or from the configuration file. The elements of the array are used as follows. Please see the appropriate entries in section OPTIONS.

$config[0]: mode

$config[1]: dbhost

$config[2]: dbport

$config[3]: dbname

$config[4]: dbuser

$config[5]: dbpass

$config[6]: tablename

$config[7]: datafile

$config[8]: mask_or_delimiter (implicit_level for mode k)

$config[9]: firstrecord (default_level for mode k)

$config[10]: specfile (template for mode k)

$config[11]: fieldlist (temporary for mode k)

$config[12]: forcecase

$config[13]: quotemark (not used for mode k)

$config[14]: apostrophes (not used for mode k)


FILES

Please see section OPTIONS for required fields, specifically datafile and specfile.


DIAGNOSTICS AND GOTCHAS

I am too lazy to go over every error message here. Besides, the ones you will get from STDOUT are descriptive enough.

However, some words of note and caution are in order.

1. All fields are exported as text. If you choose a quoting character, all fields will be quoted.

2. For delimited files, obtaining field names from the database table is NOT the default. The order is: specfile, fieldlist, database table. If you want to use the database table/view field names, make sure specfile and fieldlist are both NULL.

3. Using a specfile or fieldlist for delimited data is not a bad idea - pgexport checks to make sure the number of fields in the table or view matches the number of fields in your specfile or fieldlist. A field count is also always performed for fixed-width records.

4. If you have special data types (e.g. dates, currency) that need special formatting before export, handle that using a database view; pgexport does not perform any data formatting.

5. If a file exists with the same name as datafile, the program will terminate.

6. If you are using the program in command-line mode and want to use the double-quote or single-quote character for quoting, you must escape it first with a backslash. You do not escape characters in configuration file mode.

7. For those of you on Unixlike systems, you may want to run unix2dos after export before sending the file out into the WinDOS world.
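
The lookup order in item 2 and the field count in item 3 can be sketched as follows. Both helper names (field_name_source and field_count_matches) are hypothetical, and this is an illustration of the documented behavior rather than the pgexport source.

```perl
use strict;
use warnings;

# Hypothetical helper: decide where field names come from for a delimited export.
# The documented order is specfile, then fieldlist, then the database table.
sub field_name_source {
    my ($specfile, $fieldlist) = @_;
    my $given = sub {
        my ($value) = @_;
        return defined $value && $value ne 'NULL' && $value ne '0';
    };
    return 'specfile'  if $given->($specfile);
    return 'fieldlist' if $given->($fieldlist);
    return 'table';
}

# Hypothetical helper: compare the number of fields in the table or view
# with the number supplied by the specfile or fieldlist.
sub field_count_matches {
    my ($table_fields, $given_fields) = @_;
    return scalar(@$table_fields) == scalar(@$given_fields) ? 1 : 0;
}

print field_name_source('NULL', 'id,name'), "\n";   # prints: fieldlist
print field_name_source('NULL', 'NULL'), "\n";      # prints: table
print field_count_matches([qw(id name zip)], [qw(id name)]), "\n";   # prints: 0
```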


REQUIRES

Perl 5.004 or higher, Text::ParseWords, DBI, DBD::Pg

Term::ReadKey is required for no-echo password entry from the command line when prompted. If this module is unavailable, the program will still run, but no-echo entry will not be available (you will have to enter your password in clear text from the command line). See previous entry for dbpass.


SEE ALSO

pack(), unpack(), Text::ParseWords, DBI, DBD::Pg, Term::ReadKey


THANKS

Many thanks to the PostgreSQL development team for a database I can live with (meaning inlined functions using PL/Perl), to Larry Wall and his little helpers for the best utility language in the world, and to the contributors to Fedora Core 1.


AUTHOR

Wayne Matthew Syvinski, matthew@techcelsior.com