Modules/_csv.c
Source:
cpython 3.14 @ ab2d84fe1023/Modules/_csv.c
_csv is the C accelerator for the csv module. It provides reader, writer, and Dialect objects.
Map
| Lines | Symbol | Role |
|---|---|---|
| 1-120 | Dialect | Encapsulate delimiter, quoting, escaping, line terminator |
| 121-300 | csv_reader | Iterate rows from a file-like object; yield lists of strings |
| 301-500 | Reader_iternext | Pull one row from the underlying iterator |
| 501-700 | csv_writer | Write rows to a file-like object |
| 701-900 | Writer_writerow | Format one row and write it |
| 901-1100 | csv.register_dialect / unregister_dialect | Named dialect registry |
| 1101-1400 | QUOTE_* constants | QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE, QUOTE_STRINGS, QUOTE_NOTNULL |
Reading
Dialect
// CPython: Modules/_csv.c:280 Dialect_new
static PyObject *
Dialect_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    /* Keyword arguments set the dialect fields:
         delimiter        (default ',')
         doublequote      (default True)
         escapechar       (default None)
         lineterminator   (default '\r\n')
         quotechar        (default '"')
         quoting          (default QUOTE_MINIMAL)
         skipinitialspace (default False)
         strict           (default False)
    */
}
Dialects are validated at construction: delimiter must be a single character, quoting must be one of the QUOTE_* constants, and quotechar must be set when quoting != QUOTE_NONE.
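The validation rules above can be observed from Python by passing format parameters to `csv.reader`, which builds a Dialect internally. This is a minimal sketch: `try_dialect` is a hypothetical helper, and it catches several exception types because the exact type raised is an assumption that may vary between CPython versions.

```python
import csv

def try_dialect(**fmtparams):
    """Return None if the parameters are accepted, else the error message."""
    try:
        # csv.reader constructs a Dialect from the keyword arguments.
        csv.reader([], **fmtparams)
    except (TypeError, ValueError, csv.Error) as exc:
        return str(exc)
    return None

# A single-character delimiter is accepted.
print(try_dialect(delimiter=";"))
# A two-character delimiter is rejected.
print(try_dialect(delimiter=";;"))
# A quoting value outside the QUOTE_* constants is rejected.
print(try_dialect(quoting=99))
# quotechar=None is rejected while quoting is enabled...
print(try_dialect(quotechar=None, quoting=csv.QUOTE_MINIMAL))
# ...but accepted with QUOTE_NONE.
print(try_dialect(quotechar=None, quoting=csv.QUOTE_NONE))
```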
Reader_iternext
// CPython: Modules/_csv.c:780 Reader_iternext
static PyObject *
Reader_iternext(ReaderObj *self)
{
    PyObject *lineobj = PyIter_Next(self->input_iter);
    if (lineobj == NULL) return NULL; /* StopIteration */
    /* State machine: START_RECORD -> START_FIELD -> IN_FIELD /
       IN_QUOTED_FIELD -> END_FIELD. Handles doubled quotes
       (doublequote=True) and escape characters. */
    return fields; /* list of str */
}
The parser is a hand-written character-by-character state machine. It handles embedded newlines in quoted fields by calling PyIter_Next again for continuation lines.
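To illustrate the character-by-character approach, here is a simplified Python sketch of such a state machine. It is not the C implementation: `parse_line` and its four states are hypothetical names, and it handles only the delimiter, the quotechar, and doubled quotes within a single line. The last lines use the real `csv.reader` to show a quoted field continuing across two input lines.

```python
import csv

# Hypothetical state names for this sketch (a subset of what a full parser needs).
START_FIELD, IN_FIELD, IN_QUOTED_FIELD, QUOTE_IN_QUOTED_FIELD = range(4)

def parse_line(line, delimiter=",", quotechar='"'):
    """Split one line into fields, honoring quotes and doubled quotes."""
    fields, buf, state = [], [], START_FIELD
    for ch in line:
        if state == START_FIELD:
            if ch == quotechar:
                state = IN_QUOTED_FIELD
            elif ch == delimiter:
                fields.append("")          # empty field
            else:
                buf.append(ch)
                state = IN_FIELD
        elif state == IN_FIELD:
            if ch == delimiter:
                fields.append("".join(buf))
                buf, state = [], START_FIELD
            else:
                buf.append(ch)
        elif state == IN_QUOTED_FIELD:
            if ch == quotechar:
                state = QUOTE_IN_QUOTED_FIELD  # closing quote or first half of ""
            else:
                buf.append(ch)
        else:  # QUOTE_IN_QUOTED_FIELD
            if ch == quotechar:
                buf.append(ch)             # "" inside quotes is a literal quote
                state = IN_QUOTED_FIELD
            elif ch == delimiter:
                fields.append("".join(buf))
                buf, state = [], START_FIELD
            else:
                buf.append(ch)
                state = IN_FIELD
    fields.append("".join(buf))            # flush the last field
    return fields

sample = 'a,"b,c","say ""hi""",d'
print(parse_line(sample))
print(next(csv.reader([sample])))

# The real reader pulls a second line when a quoted field is still open.
continued = list(csv.reader(['a,"line one\n', 'line two",b\n']))
print(continued)
```

The sketch omits escapechar, skipinitialspace, strict mode, and line-terminator handling, all of which the C parser covers.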
Writer_writerow
// CPython: Modules/_csv.c:1020 csv_writerow
static PyObject *
csv_writerow(WriterObj *self, PyObject *seq)
{
    /* For each field: convert to str, decide quoting, escape specials.
       Join with delimiter, append lineterminator, write to output. */
    return PyObject_CallOneArg(self->write, line); /* self->write is the file's bound write method */
}
QUOTE_MINIMAL quotes only fields that contain the delimiter, the quotechar, or a line-terminator character. QUOTE_ALL quotes every field. QUOTE_NONNUMERIC quotes every non-numeric field when writing; when reading, it converts every unquoted field to float.
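The difference between these modes is easiest to see by writing the same row under each one. In this short sketch, `render` is a hypothetical helper that writes one row to an in-memory buffer with the default dialect otherwise.

```python
import csv
import io

def render(row, quoting):
    """Write one row with the given quoting mode and return the raw text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting).writerow(row)
    return buf.getvalue()

row = ["a", "b,c", 1]
minimal = render(row, csv.QUOTE_MINIMAL)        # only "b,c" needs quotes
quote_all = render(row, csv.QUOTE_ALL)          # every field quoted
nonnumeric = render(row, csv.QUOTE_NONNUMERIC)  # the int stays bare

print(repr(minimal))
print(repr(quote_all))
print(repr(nonnumeric))

# On read, QUOTE_NONNUMERIC converts unquoted fields to float.
parsed = next(csv.reader(['"a",1.5'], quoting=csv.QUOTE_NONNUMERIC))
print(parsed)
```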
QUOTE_* constants
// CPython: Modules/_csv.c:120 quoting constants
#define QUOTE_MINIMAL 0 /* quote only when needed */
#define QUOTE_ALL 1 /* always quote */
#define QUOTE_NONNUMERIC 2 /* quote non-numeric; return float on read */
#define QUOTE_NONE 3 /* never quote; use escapechar */
#define QUOTE_STRINGS 4 /* like QUOTE_NONNUMERIC but only quote str */
#define QUOTE_NOTNULL 5 /* quote all except None */
With QUOTE_NONE the writer never quotes a field. If a field contains the delimiter or another special character, the writer prefixes it with escapechar; if no escapechar is set, writerow raises csv.Error.
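A small sketch of the QUOTE_NONE behavior, with and without an escapechar. `write_unquoted` is a hypothetical helper, and the backslash escapechar is just an illustrative choice.

```python
import csv
import io

def write_unquoted(row, escapechar=None):
    """Write one row with QUOTE_NONE and return the raw text."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar=escapechar)
    writer.writerow(row)
    return buf.getvalue()

# No special characters: no escapechar needed.
plain = write_unquoted(["a", "b"])
print(repr(plain))

# The embedded delimiter is escaped instead of quoted.
escaped = write_unquoted(["a,b", "c"], escapechar="\\")
print(repr(escaped))

# Without an escapechar the same row cannot be represented.
try:
    write_unquoted(["a,b", "c"])
    failed = False
except csv.Error as exc:
    failed = True
    print("csv.Error:", exc)
```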
DictReader / DictWriter
# CPython: Lib/csv.py:82 DictReader
# Simplified sketch: the real class also applies restkey/restval and skips blank rows.
from csv import reader

class DictReader:
    def __init__(self, f, fieldnames=None, restkey=None, restval=None,
                 dialect='excel', *args, **kwds):
        self.reader = reader(f, dialect, *args, **kwds)
        self.fieldnames = fieldnames  # None: read from first row

    def __iter__(self):
        return self

    def __next__(self):
        row = next(self.reader)
        if self.fieldnames is None:
            # The first row is the header; fetch the next one as data.
            self.fieldnames = row
            row = next(self.reader)
        return dict(zip(self.fieldnames, row))
DictReader and DictWriter are thin Python wrappers in Lib/csv.py around the C reader/writer. DictReader auto-reads fieldnames from the first row if not supplied.
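A brief usage sketch of the standard-library wrappers. The sample data and the `restkey`/`restval` values are made up for illustration; the second data row has an extra field and the third is short, to show how each case is handled.

```python
import csv
import io

data = io.StringIO("name,qty\r\nwidget,3\r\ngadget,5,spare\r\nbolt\r\n")

# fieldnames is omitted, so the first row supplies the keys.
dict_reader = csv.DictReader(data, restkey="extra", restval="n/a")
rows = list(dict_reader)
for row in rows:
    print(row)

# DictWriter maps dict keys back to column order.
out = io.StringIO()
dict_writer = csv.DictWriter(out, fieldnames=["name", "qty"])
dict_writer.writeheader()
dict_writer.writerow({"name": "widget", "qty": 3})
written = out.getvalue()
print(repr(written))
```

Surplus fields are collected in a list under `restkey`, and missing fields are filled with `restval`.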
gopy notes
csv.reader and csv.writer are implemented in module/csv/module.go, which ports the C state machine directly. Dialect is module/csv.Dialect. DictReader and DictWriter are pure Python wrappers in stdlib/csv.py.