funpack.exporting_tsv
This module provides the exportTSV() and exportCSV() functions,
which export the data contained in a DataTable to a TSV or CSV file.
- funpack.exporting_tsv.NUM_ROWS = 10000
Default number of rows to export at a time by exportTSV() - the default value for its numRows argument.
- funpack.exporting_tsv.exportCSV(dtable, outfile, **kwargs)[source]
Export data to a CSV-style file.
This function is identical to exportTSV(), except that the default value for the sep argument is ',' instead of '\t'.
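The relationship between the two functions can be pictured as a thin wrapper that changes only the default separator. The following is a minimal sketch of that idea, not funpack's own code: `export_tsv_sketch` and `export_csv_sketch` are hypothetical stand-ins, and they operate on plain lists of rows rather than a DataTable.

```python
import io


def export_tsv_sketch(rows, outfile, sep=None, **kwargs):
    """Hypothetical stand-in for exportTSV(): writes rows joined by sep."""
    if sep is None:
        sep = '\t'  # tab is the TSV default
    for row in rows:
        outfile.write(sep.join(str(v) for v in row) + '\n')


def export_csv_sketch(rows, outfile, **kwargs):
    """Hypothetical stand-in for exportCSV(): same behaviour, comma default."""
    # Only fill in the separator if the caller did not pass one.
    if kwargs.get('sep') is None:
        kwargs['sep'] = ','
    export_tsv_sketch(rows, outfile, **kwargs)


rows = [['eid', 'age'], [1, 42]]
tsv_buf, csv_buf = io.StringIO(), io.StringIO()
export_tsv_sketch(rows, tsv_buf)   # tab-separated
export_csv_sketch(rows, csv_buf)   # comma-separated
```

Because the wrapper forwards all other keyword arguments unchanged, any option accepted by the TSV stand-in is also available through the CSV one.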
- funpack.exporting_tsv.exportTSV(dtable, outfile, sep=None, missingValues=None, escapeNewlines=False, numRows=None, dropNaRows=False, dateFormat=None, timeFormat=None, formatters=None, **kwargs)[source]
Export data to a TSV-style file.
This may be parallelised by row - chunks of numRows rows will be saved to separate temporary output files in parallel, and then concatenated afterwards to produce the final output file.
- Parameters:
dtable – DataTable containing the data
outfile – File to output to
sep – Separator character to use. Defaults to '\t'.
missingValues – String to use for missing/NA values. Defaults to the empty string.
escapeNewlines – If True, all string/object types are escaped using shlex.quote.
numRows – Number of rows to write at a time. Defaults to NUM_ROWS.
dropNaRows – If True, rows which do not contain data for any columns are not exported.
dateFormat – Name of formatter to use for date columns.
timeFormat – Name of formatter to use for time columns.
formatters – Dict of { [vid|column] : formatter } mappings, specifying custom formatters to use for specific variables.
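The chunk-then-concatenate strategy described for exportTSV() might look roughly like the sketch below. It uses only the standard library and plain lists of rows. The names `write_chunk` and `export_in_chunks`, the use of threads, and the choice to write the header with the first chunk are all assumptions made for illustration, not details taken from funpack.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_chunk(rows, path, sep, header=None):
    """Write one chunk of rows (and optionally a header) to its own file."""
    with open(path, 'w', newline='') as f:
        if header is not None:
            f.write(sep.join(header) + '\n')
        for row in rows:
            f.write(sep.join(str(v) for v in row) + '\n')
    return path


def export_in_chunks(header, rows, outfile, sep='\t', num_rows=10000):
    """Write rows to outfile by way of per-chunk temporary files."""
    # Split the rows into consecutive chunks of at most num_rows rows.
    chunks = [rows[i:i + num_rows]
              for i in range(0, len(rows), num_rows)] or [[]]
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, f'chunk_{i}.tsv')
                 for i in range(len(chunks))]
        # Assumption: only the first chunk carries the header row.
        headers = [header] + [None] * (len(chunks) - 1)
        # Write the chunks concurrently, each to its own temporary file.
        with ThreadPoolExecutor() as pool:
            list(pool.map(write_chunk, chunks, paths,
                          [sep] * len(chunks), headers))
        # Concatenate the chunk files, in order, into the final output.
        with open(outfile, 'w', newline='') as out:
            for path in paths:
                with open(path, newline='') as f:
                    shutil.copyfileobj(f, out)
```

Writing each chunk to a separate file means the workers never contend for the same file handle; the final concatenation is a sequential copy that preserves row order.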
- funpack.exporting_tsv.writeDataFrame(dtable, outfile, header, chunki, sep, missingValues, dropNaRows, dateFormat, timeFormat, formatters)[source]
Writes all of the data in dtable to outfile.
Called by exportTSV() to output one chunk of data.
- Parameters:
dtable – DataTable containing the data
outfile – File to output to
header – If True, write the header row (column names).
chunki – Chunk index (used for logging)
sep – Separator character to use.
missingValues – String to use for missing/NA values.
dropNaRows – If True, rows which do not contain data for any columns are not exported.
dateFormat – Name of formatter to use for date columns.
timeFormat – Name of formatter to use for time columns.
formatters – Dict of { [vid|column] : formatter } mappings, specifying custom formatters to use for specific variables.
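To make the header, missingValues and dropNaRows options concrete, here is a small standard-library sketch of writing a single chunk. It is an illustration only: `write_chunk_sketch` is a hypothetical function, rows are plain lists, and `None` is assumed to mark a missing value, which may differ from how funpack represents missing data internally.

```python
import io


def write_chunk_sketch(columns, rows, outfile, header, sep,
                       missing_values='', drop_na_rows=False):
    """Hypothetical stand-in for writing one chunk; None marks missing data."""
    if header:
        # Column names are written only when requested.
        outfile.write(sep.join(columns) + '\n')
    for row in rows:
        # Skip rows with no data in any column, if requested.
        if drop_na_rows and all(v is None for v in row):
            continue
        # Substitute the missing-value string for each missing cell.
        cells = [missing_values if v is None else str(v) for v in row]
        outfile.write(sep.join(cells) + '\n')


buf = io.StringIO()
rows = [[1.5, None], [None, None], [2.0, 'x']]
write_chunk_sketch(['a', 'b'], rows, buf, header=True, sep='\t',
                   missing_values='NA', drop_na_rows=True)
```

In this example the second row is dropped because every cell is missing, while the single missing cell in the first row is written as `NA`.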