TableHDU
Binary table HDU (BINTABLE). Subclass of rustfits.HDU.
- class rustfits.TableHDU
Bases: HDU
- __getitem__(key, /)
Return self[key].
- __len__()
Return len(self).
- __setitem__(key, value, /)
Set self[key] to value.
- add_checksum()
Compute and store both DATASUM and CHECKSUM cards.
CHECKSUM is the encoded complement of the running checksum over (header + data) so that the total HDU checksum lands on 0xFFFFFFFF, per the FITS Checksum Convention. This is the call most users want: it writes both cards atomically.
See add_datasum() for the manual-refresh contract.
- add_datasum()
Compute and store the DATASUM checksum card.
DATASUM is the unsigned 32-bit checksum of the HDU’s data section, per the FITS Checksum Convention. Call it after any operation that changes the data: write(), append(), __setitem__, insert_column(), delete_column(), repack(). rustfits does NOT auto-refresh checksums on mutation; the user opts in explicitly because checksum computation can be expensive on large data sections.
See also
add_checksum – Also compute the full HDU CHECKSUM card.
verify_datasum – Compare the stored DATASUM against the current data bytes.
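The arithmetic behind both cards can be sketched in pure Python. This is only an illustration of the 32-bit ones'-complement sum described by the FITS Checksum Convention, not rustfits's implementation; the helper name `ones_complement_sum32` is hypothetical, and the ASCII encoding used for the CHECKSUM card value is left out.

```python
import struct

def ones_complement_sum32(data: bytes, start: int = 0) -> int:
    """32-bit ones'-complement sum of big-endian words, seeded with `start`."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 (FITS blocks are 2880 bytes)")
    total = start
    for (word,) in struct.iter_unpack(">I", data):
        total += word
    # Fold carries out of bit 31 back into the low bits (end-around carry).
    while total >> 32:
        total = (total & 0xFFFFFFFF) + (total >> 32)
    return total

# Toy "data section" of three 32-bit words.
data = bytes.fromhex("00000001" "fffffffe" "00000005")
datasum = ones_complement_sum32(data)

# Adding the complement of the running sum drives the total to all ones.
complement = ~datasum & 0xFFFFFFFF
total = ones_complement_sum32(complement.to_bytes(4, "big"), start=datasum)
print(hex(datasum), hex(total))  # 0x5 0xffffffff
```

Because the sum depends on every data byte, any mutation invalidates a stored value, which is why the cards have to be refreshed explicitly after writes.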
- append(data, *, names=None)
Append rows to the end of the table.
Grows NAXIS2 in the header and the data section to fit the new rows. For HDUs that are not the last on disk, the file tail is shifted forward and every later HDU’s offsets are bumped in lockstep; previously issued handles remain valid.
- Parameters:
data (numpy.ndarray, dict, or list/tuple of ndarrays) – Same three input forms as write(): a structured ndarray, a {name: ndarray} dict, or a list/tuple of per-column ndarrays paired with names=. Its length defines the number of new rows.
names (list of str, optional) – Required for the list/tuple form; ignored otherwise.
Notes
Validate-then-mutate: input is fully validated (columns, dtypes, shapes) before any file or header bytes are touched, so a dtype mismatch can’t leave the file half-grown.
Mid-write I/O failures taint the file — subsequent reads and writes will raise until the user closes and reopens.
See also
extend – Alias of append, kept for symmetry with ImageHDU.extend().
write – Overwrite all rows in place.
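The validate-then-mutate ordering from the Notes can be illustrated with a toy in-memory table. `ToyTable` is a hypothetical stand-in (plain lists replace column arrays, and no file is involved); it only shows that a rejected append leaves existing rows untouched.

```python
class ToyTable:
    """In-memory stand-in for a table; plain lists stand in for column arrays."""

    def __init__(self, colnames):
        self.columns = {name: [] for name in colnames}

    @property
    def nrows(self):
        return len(next(iter(self.columns.values())))

    def append(self, data):
        # Phase 1: validate everything; nothing is modified yet.
        if set(data) != set(self.columns):
            raise ValueError("column names must match the table exactly")
        lengths = {len(values) for values in data.values()}
        if len(lengths) != 1:
            raise ValueError("all columns must supply the same number of rows")
        # Phase 2: mutate; by now the input is known to be consistent.
        for name, values in data.items():
            self.columns[name].extend(values)

table = ToyTable(["RA", "DEC"])
table.append({"RA": [1.0, 2.0], "DEC": [3.0, 4.0]})
try:
    table.append({"RA": [5.0], "DEC": [6.0, 7.0]})  # ragged input is rejected
except ValueError:
    pass
print(table.nrows)  # 2
```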
- appending()
No-op batched-append context manager.
Uncompressed table appends already go straight to disk (there is no partial-trailing-tile re-encode tax to amortize), so this context does nothing on enter or exit. It exists for API symmetry with CompressedTableHDU.appending(), where the context does meaningful work. Generic code that iterates HDUs of mixed compressed / uncompressed types can use the pattern uniformly:
    for hdu in fits:
        with hdu.appending():
            for batch in batches:
                hdu.append(batch)
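A rough sketch of why a no-op context is still useful: the two classes below are hypothetical stand-ins, not rustfits types. `PlainTable` models immediate appends, and `BufferedTable` models any table that defers work until the context exits; it does not attempt to model tile re-encoding.

```python
from contextlib import contextmanager, nullcontext

class PlainTable:
    """Stand-in for an uncompressed table: appends land immediately."""
    def __init__(self):
        self.rows = []
    def append(self, batch):
        self.rows.extend(batch)
    def appending(self):
        return nullcontext(self)  # nothing to do on enter or exit

class BufferedTable:
    """Stand-in for a table that benefits from batching."""
    def __init__(self):
        self.rows, self.pending, self.flushes = [], [], 0
        self._batching = False
    def append(self, batch):
        self.pending.extend(batch)
        if not self._batching:
            self._flush()
    def _flush(self):
        self.rows.extend(self.pending)
        self.pending.clear()
        self.flushes += 1
    @contextmanager
    def appending(self):
        self._batching = True
        try:
            yield self
        finally:
            self._batching = False
            self._flush()  # one deferred flush for the whole batch run

tables = [PlainTable(), BufferedTable()]
for table in tables:
    with table.appending():
        for batch in ([1, 2], [3], [4, 5]):
            table.append(batch)
print([t.rows for t in tables], tables[1].flushes)
# [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]] 1
```

The calling loop is identical for both types, which is the point of keeping the context on the uncompressed side.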
- colnames
Column names in on-disk order, as a tuple.
Names are returned with their on-disk case preserved verbatim. Lookup against this list (e.g. by read()’s columns= argument) is case-insensitive throughout the API. Returned as a tuple so the value is immutable from the caller side.
- delete_column(key)
Remove a column from the table.
- Parameters:
key (str or int) – Column name (case-insensitive) or a 0-based integer index. Negative indices wrap from the end.
Notes
Works on both fixed and VLA columns. For a VLA column, the descriptor bytes are removed from each row but the heap cells the column referenced are left as-is; they become orphans that repack() reclaims. Other existing VLA columns are preserved across the delete; their heap relocates backward to sit after the new (shorter) main rows.
The row shuffle runs in 1 MiB front-to-back strips so peak memory stays bounded. Mid-write I/O failures taint the file (close + reopen to recover).
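The key rules above (case-insensitive names, 0-based indices, negative wrap) can be sketched as a small resolver. `resolve_column` is a hypothetical helper, and raising ValueError for a bad key is an assumption; the source does not state which exception delete_column() uses.

```python
def resolve_column(colnames, key):
    """Return the 0-based index for a column name or integer key."""
    ncols = len(colnames)
    if isinstance(key, int) and not isinstance(key, bool):
        index = key + ncols if key < 0 else key  # negative keys wrap from the end
        if not 0 <= index < ncols:
            raise ValueError(f"column index {key} out of range for {ncols} columns")
        return index
    folded = [name.upper() for name in colnames]  # compare names case-insensitively
    try:
        return folded.index(key.upper())
    except ValueError:
        raise ValueError(f"unknown column {key!r}") from None

colnames = ("RA", "Dec", "mag")
print(resolve_column(colnames, "dec"), resolve_column(colnames, -1))  # 1 2
```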
- dtype
The numpy structured dtype the table reads into.
Reflects the default-read (scale=True) dtype, i.e. columns with the TSCAL/TZERO unsigned trick appear as u2/u4/u8/i1, and other scaled columns as f8. Useful for inspecting the column layout (names, per-cell shapes, types) without paying for an actual read.
- Returns:
Structured dtype with one field per column.
- Return type:
numpy.dtype
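For readers unfamiliar with the "unsigned trick": FITS binary tables have no unsigned 16-bit type, so unsigned values are conventionally stored as signed integers with TSCAL = 1 and TZERO = 32768. The sketch below shows the round trip for the u2 case using only `struct`; `encode_u2` and `decode_u2` are hypothetical helper names. The i1 case is the mirror image, applying TZERO = -128 to the FITS unsigned byte type.

```python
import struct

TZERO_U2 = 32768  # 2**15, the conventional offset for unsigned 16-bit columns

def encode_u2(value: int) -> bytes:
    """Store an unsigned 16-bit value as a big-endian signed 16-bit integer."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in u2")
    return struct.pack(">h", value - TZERO_U2)

def decode_u2(raw: bytes) -> int:
    """Recover the unsigned value: physical = stored * TSCAL + TZERO with TSCAL = 1."""
    (stored,) = struct.unpack(">h", raw)
    return stored + TZERO_U2

for value in (0, 32768, 65535):
    raw = encode_u2(value)
    print(value, raw.hex(), decode_u2(raw))
# 0 8000 0
# 32768 0000 32768
# 65535 7fff 65535
```

Since the offset is an exact integer, the mapping is lossless, which is why such columns can be reported as u2 rather than promoted to f8.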
- extend(data, *, names=None)
Alias for append().
Kept for symmetry with ImageHDU.extend() so generic code that iterates HDUs and calls .extend(...) on each continues to work. The primary table-side name is append() because that’s the natural verb for adding rows to a table.
- extending()
Alias for appending(). Mirrors CompressedImageHDU.extending() for parity with the image side, so generic code that iterates HDUs of any type can use with hdu.extending(): uniformly.
- extname
EXTNAME header value, or None when the keyword is absent.
EXTNAME is the user-visible name of the HDU (e.g. 'SCI', 'CATALOG'). Combined with extver, it’s the standard way to identify HDUs without relying on position-by-index.
- extver
EXTVER header value, defaulting to 1 when absent.
Per the FITS standard, multiple HDUs may share an EXTNAME and are distinguished by EXTVER. Returns 1 rather than None for the absent case so callers can compare/select without handling Optional[int].
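A minimal sketch of selecting by (EXTNAME, EXTVER), using plain dicts as stand-ins for headers. `find_hdu` is a hypothetical helper, not a rustfits function; it simply mirrors the None / 1 defaults described above and uses exact name matching.

```python
def find_hdu(headers, extname, extver=1):
    """Return the index of the first header matching (EXTNAME, EXTVER)."""
    for index, header in enumerate(headers):
        name = header.get("EXTNAME")        # None when the keyword is absent
        version = header.get("EXTVER", 1)   # 1 when the keyword is absent
        if name == extname and version == extver:
            return index
    raise KeyError((extname, extver))

headers = [
    {},                                # primary HDU, no EXTNAME
    {"EXTNAME": "SCI"},                # EXTVER absent, treated as 1
    {"EXTNAME": "SCI", "EXTVER": 2},
    {"EXTNAME": "CATALOG"},
]
print(find_hdu(headers, "SCI"), find_hdu(headers, "SCI", 2), find_hdu(headers, "CATALOG"))
# 1 2 3
```

Defaulting the version to 1 keeps the comparison a plain integer equality, with no special case for missing keywords.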
- has_data
True iff this HDU has a non-empty data section.
Works uniformly across image and table HDUs: the test is NAXIS > 0 AND every NAXISn > 0. For images that means “at least one pixel”; for tables it means “at least one row of at least one column”.
Useful for picking the first HDU worth reading in a file (primary HDUs are often empty stubs):
    hdu = next(h for h in fits if h.has_data)
    arr = hdu.read()
Edge case: a VLA table with NAXIS2=0 but PCOUNT>0 (heap-only) returns False. No main rows means there’s nothing to interpret the heap through, which is the right answer for the “is this HDU worth reading?” question.
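The predicate itself is small enough to write out. The sketch below applies the stated rule to plain dicts standing in for headers; the function is an illustration of the rule, not the library's code.

```python
def has_data(header):
    """True iff NAXIS > 0 and every NAXISn > 0."""
    naxis = header.get("NAXIS", 0)
    return naxis > 0 and all(
        header.get(f"NAXIS{n}", 0) > 0 for n in range(1, naxis + 1)
    )

primary_stub = {"NAXIS": 0}
image = {"NAXIS": 2, "NAXIS1": 512, "NAXIS2": 512}
empty_table = {"NAXIS": 2, "NAXIS1": 24, "NAXIS2": 0}
heap_only = {"NAXIS": 2, "NAXIS1": 8, "NAXIS2": 0, "PCOUNT": 4096}  # PCOUNT is ignored

print([has_data(h) for h in (primary_stub, image, empty_table, heap_only)])
# [False, True, False, False]
```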
- header
The HDU’s FITSHeader.
Returns a live view of this HDU’s header cards. Mutations via the header object (__setitem__, __delitem__, update, add_comment, add_history, add_blank) write through to disk immediately, following the disk-write-before-commit ordering documented on FITSHeader.
Reads are cheap; mutations may grow the reserved header blocks in place if the new card list exceeds the current allotment.
- index
The HDU’s 0-based position in its file.
Stable for the lifetime of the FITS object: even when an earlier HDU grows and shifts this HDU’s bytes forward, the index is unchanged because the HDU is still at the same position in the file’s HDU list.
- insert_column(name, data, *, position=None, after=None, before=None, unit=None, inner_dtype=None, heap_format=None, bit_packed=False)
Insert a new column into the table.
- Parameters:
name (str) – Column name (becomes TTYPEn). Must not duplicate an existing column (case-insensitive check).
data (numpy.ndarray) – Column values, shape (NAXIS2,) + per_cell_shape. For fixed columns the dtype determines the FITS letter (i2/i4/i8/u1/u2/u4/u8/f4/f8/c8/c16/b1 + S/U strings); the unsigned-int trick on u2/u4/u8 emits TZERO. For VLA columns, pass Object dtype with one inner ndarray per row and set inner_dtype=.
position (int, optional) – 0-based column index in the result, 0..=ncols. ncols appends at the end (also the default when none of position / after / before is set). Mutually exclusive with after and before.
after (str or int, optional) – Insert after this column. Accepts a name (case-insensitive) or a 0-based integer index (negative wraps). Mutually exclusive with position and before.
before (str or int, optional) – Insert before this column. Same rules as after. Mutually exclusive with position and after.
unit (str, optional) – TUNITn string.
inner_dtype (str, optional) – Required when data is Object dtype (VLA insert). Inner element dtype as a string: 'f4' / 'i4' / '?' etc. Maps to the FITS inner-element letter.
heap_format ({'P', 'Q'}, optional) – For VLA columns only. 'P' (default) uses 8-byte descriptors with a 4 GB heap ceiling; 'Q' uses 16-byte descriptors with no practical ceiling.
bit_packed (bool, optional) – For boolean columns only. If True, emit an X (or PX/QX for VLA) bit-packed column instead of the default L (one byte per bool). Default False.
- Raises:
ValueError – Duplicate name; multiple position kwargs set; unknown position; dtype mismatch; row count mismatch with NAXIS2; inner_dtype / heap_format set on non-Object input; or the file uses a non-default THEAP (see repack() for the same limitation).
Notes
Strip-based row shuffler bounds peak memory at ~1 MiB regardless of table size. Existing VLA columns are preserved across the insert; their heap is relocated forward to sit after the new (wider) main rows. Mid-write I/O failures taint the file (close + reopen to recover).
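How the three placement arguments could reduce to a single insertion index is sketched below. `resolve_insert_position` is a hypothetical helper written from the parameter descriptions above; the exact error messages and the order of checks inside rustfits are not known from this page.

```python
def resolve_insert_position(colnames, position=None, after=None, before=None):
    """Return the 0-based index at which a new column would be inserted."""
    ncols = len(colnames)
    if sum(arg is not None for arg in (position, after, before)) > 1:
        raise ValueError("position, after, and before are mutually exclusive")

    def index_of(key):
        # Accept a 0-based index (negative wraps) or a case-insensitive name.
        if isinstance(key, int):
            index = key + ncols if key < 0 else key
            if not 0 <= index < ncols:
                raise ValueError(f"column index {key} out of range")
            return index
        folded = [name.upper() for name in colnames]
        if key.upper() not in folded:
            raise ValueError(f"unknown column {key!r}")
        return folded.index(key.upper())

    if after is not None:
        return index_of(after) + 1
    if before is not None:
        return index_of(before)
    if position is None:
        return ncols  # default: append at the end
    if not 0 <= position <= ncols:
        raise ValueError(f"position must be in 0..={ncols}")
    return position

colnames = ("RA", "DEC", "MAG")
print(
    resolve_insert_position(colnames),
    resolve_insert_position(colnames, after="ra"),
    resolve_insert_position(colnames, before=-1),
    resolve_insert_position(colnames, position=0),
)
# 3 1 2 0
```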
- iter(*, chunksize=None, columns=None, scale=True)
Iterate over table rows or row-chunks.
hdu.iter() is equivalent to for row in hdu, yielding one row per iteration as a numpy scalar record. Passing chunksize switches to yielding structured arrays instead.
- Parameters:
chunksize (int, optional) – None (default) yields one row per iteration as a numpy scalar record (the same single-element value hdu[i] returns). A positive integer yields a structured numpy.ndarray of up to chunksize rows per iteration; the final chunk may be shorter. 0 is rejected.
columns (list of str, optional) – Restrict iteration to these columns (case-insensitive), forwarded to read(). Each yielded record / chunk then carries only the named fields. This is the supported way to iterate a column subset; a single-column iter(columns=["x"]) still yields 1-field records, so use row["x"] to get the value.
scale (bool, default True) – Apply TSCALn/TZEROn scaling, forwarded to read().
- Returns:
Yields numpy scalar records (chunksize=None) or structured numpy.ndarray chunks (chunksize=N).
- Return type:
iterator
Notes
The row count is snapshotted when the iterator is created; rows added via append() mid-iteration are not seen. Closing the file mid-iteration makes the next batch read raise the usual closed-file error. The internal read buffer is auto-sized to an ~8 MiB byte budget (rows = budget / row_width) and is not currently user-configurable; for a huge-row table, drive a manual hdu[lo:hi] loop instead.
Works identically on CompressedTableHDU, decoding only the tiles each batch touches.
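The buffer-sizing rule and the manual slicing loop mentioned in the Notes can be sketched as two small helpers. Both names are hypothetical, and flooring the result at one row is an assumption about how a very wide row would be handled.

```python
def rows_per_buffer(row_width, budget=8 * 1024 * 1024):
    """Rows that fit in the byte budget (rows = budget / row_width), at least one."""
    return max(1, budget // row_width)

def chunk_bounds(nrows, chunksize):
    """Yield (lo, hi) half-open row ranges covering 0..nrows."""
    if chunksize <= 0:
        raise ValueError("chunksize must be a positive integer")
    for lo in range(0, nrows, chunksize):
        yield lo, min(lo + chunksize, nrows)  # the final chunk may be shorter

print(rows_per_buffer(row_width=64))              # 131072
print(list(chunk_bounds(nrows=10, chunksize=4)))  # [(0, 4), (4, 8), (8, 10)]
```

In a manual loop, each (lo, hi) pair would be used as hdu[lo:hi], with the chunk size chosen by the caller instead of the fixed internal budget.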
- ncols
Number of columns in the table (TFIELDS).
- nrows
Number of rows in the table.
Reads the NAXIS2 header keyword. Equivalent to len(hdu); both are provided so that len()-based idioms (numpy len(arr), pandas len(df)) and attribute-style access both work.
- read(*, rows=None, columns=None, scale=True, mask_null=False)
Read rows from the table into a numpy structured array.
- Parameters:
rows (slice, list of int, or None, optional) – Rows to read. None (default) reads every row in file order. A slice or iterable of ints selects a subset; negative indices are supported. Iterables are deduped with first-occurrence-wins ordering.
columns (list of str, or None, optional) – Column names to read. None (default) reads every column in file order. A list selects and reorders; matching is case-insensitive against the table’s column names.
scale (bool, optional) – If True (default), apply TSCAL/TZERO scaling: the unsigned-int trick promotes to the matching unsigned dtype with no precision loss, and other scaling produces f8. If False, return raw stored values in the on-disk (TFORMn) dtype.
mask_null (bool, optional) – If True, return a numpy.ma.MaskedArray with per-field bool masks set True where the stored integer equals TNULLn. The mask comparison is done in stored-int space (pre-scaling), so it composes correctly with the TSCAL/TZERO paths. Only applies to integer fixed-width columns; variable-length columns with TNULL are rejected. Default False.
- Returns:
Structured array of shape (n_selected,) with one field per selected column. Dtype reflects the scale choice (scaled values for True, raw stored dtype for False).
- Return type:
numpy.ndarray
- Raises:
ValueError – If a row index is out of range, a column name is unknown, or mask_null=True is requested on a variable-length column carrying TNULL.
Notes
Both the rows= and columns= subsets validate fully before any file I/O happens, so an invalid selection leaves the file untouched.
Examples
Read the whole table:
    arr = hdu.read()
Read three columns from rows 100..200:
    arr = hdu.read(rows=slice(100, 200), columns=["RA", "DEC", "MAG"])
Read with masking on a column that has TNULL=-99:
    arr = hdu.read(mask_null=True)
    assert arr["FLAG"].mask.any()
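The rows= rules (negative indices, bounds checking, first-occurrence-wins dedup) can be sketched independently of any file. `normalize_rows` is a hypothetical helper, and it assumes negative indices are resolved before duplicates are dropped, which the description above does not spell out.

```python
def normalize_rows(rows, nrows):
    """Resolve negative indices, bounds-check, and drop repeats (first wins)."""
    if rows is None:
        return list(range(nrows))
    if isinstance(rows, slice):
        return list(range(*rows.indices(nrows)))
    resolved = []
    for row in rows:
        index = row + nrows if row < 0 else row
        if not 0 <= index < nrows:
            raise ValueError(f"row index {row} out of range for {nrows} rows")
        resolved.append(index)
    # dict preserves insertion order, so the first occurrence fixes the position.
    return list(dict.fromkeys(resolved))

print(normalize_rows([5, 2, 5, -1, 2], nrows=10))   # [5, 2, 9]
print(normalize_rows(slice(100, 103), nrows=1000))  # [100, 101, 102]
```

The whole selection is checked before anything is returned, matching the validate-before-I/O note above.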
- read_column(name, *, rows=None, as_bytes=False, scale=True, mask_null=False)
Read a single column into a plain (non-structured) ndarray.
Equivalent to hdu.read(columns=[name])[name] but skips the structured-array packaging; useful when you only want one column’s data.
- Parameters:
name (str) – Column name. Case-insensitive against the table’s TTYPEn values.
rows (slice, list of int, or None, optional) – Same semantics as read()’s rows=.
as_bytes (bool, optional) – Only meaningful for A (character) columns. If True, return the on-disk bytes in an S<n> field with no decode, no NUL-truncation, and no trailing-space strip; useful when a column has non-ASCII bytes that the default U decode would reject. Default False.
mask_null (bool, optional) – If True and this column carries TNULL, return a numpy.ma.MaskedArray. Default False.
- Returns:
Array of shape (n_selected,) + field_shape; field_shape is empty for scalar columns, (repeat,) or the TDIM shape for subarray columns.
- Return type:
numpy.ndarray
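Why the null mask is computed on stored integers rather than scaled values is easiest to see with numbers. In the sketch below, plain lists stand in for numpy arrays and for numpy.ma.MaskedArray, and `read_scaled` is a hypothetical helper, not part of rustfits.

```python
def read_scaled(stored, tscal=1.0, tzero=0.0, tnull=None):
    """Return (values, mask); the mask is computed on the stored integers."""
    mask = [tnull is not None and value == tnull for value in stored]
    values = [value * tscal + tzero for value in stored]
    return values, mask

stored = [10, -99, 30]
values, mask = read_scaled(stored, tscal=0.5, tzero=100.0, tnull=-99)
print(values)  # [105.0, 50.5, 115.0]
print(mask)    # [False, True, False]
```

After scaling, the null sentinel has become 50.5, so a comparison against TNULL in scaled space would miss it; comparing before scaling avoids that.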
- repack()
Rebuild the VLA heap, reclaiming orphan cells.
VLA writes (__setitem__ on a variable-length column) always append new cell bytes to the end of the heap, leaving the old bytes as orphans referenced by no descriptor. repack() walks every live descriptor, streams the referenced bytes into a compact new heap, and rewrites the descriptors to point at it. If the heap shrinks, the on-disk file shrinks too: the last HDU uses set_len, and a non-last HDU shifts the trailing HDUs backward in lockstep.
No-op for tables without VLA columns or with an already-compact heap.
- Raises:
ValueError – If the file uses a non-default THEAP (heap offset other than NAXIS1 * NAXIS2). rustfits never emits such files itself; the limitation only blocks repacking files written by other tools with a custom heap offset. Workaround: rewrite through a fresh FITS.create_table_hdu() + write().
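The compaction step can be sketched on an in-memory heap. This is a simplified model under stated assumptions: descriptors are (byte length, byte offset) pairs rather than FITS's (element count, offset), elements are one byte wide, and no two descriptors share heap bytes. `repack_heap` is a hypothetical name.

```python
def repack_heap(descriptors, heap):
    """Compact `heap`, keeping only bytes referenced by (length, offset) descriptors."""
    new_heap = bytearray()
    new_descriptors = []
    for length, offset in descriptors:
        # The cell's new offset is wherever the compact heap currently ends.
        new_descriptors.append((length, len(new_heap)))
        new_heap += heap[offset:offset + length]
    return new_descriptors, bytes(new_heap)

# Row 1's cell was rewritten: its old bytes "BBB" at offset 2 are now orphans,
# and the replacement "bbbb" was appended at the end of the heap.
heap = b"AA" + b"BBB" + b"CC" + b"bbbb"
descriptors = [(2, 0), (4, 7), (2, 5)]  # rows 0, 1, 2
new_descriptors, new_heap = repack_heap(descriptors, heap)
print(new_descriptors, new_heap)
# [(2, 0), (4, 2), (2, 6)] b'AAbbbbCC'
```

The heap shrinks from 11 to 8 bytes here, which corresponds to the case where the on-disk file would shrink as well.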
- shape
Shape of the table, equivalent to (hdu.nrows,).
- units
Per-column units (TUNITn), as a dict.
Maps each column name (case preserved) to its TUNITn string, or None when TUNITn is unset for that column. Dict iteration follows on-disk column order.
Informational only: nothing in the read path consumes units.
- Returns:
{column_name: unit_or_None}.
- Return type:
dict
- verify_checksum()
Verify the stored CHECKSUM over the full HDU.
- Returns:
True if the stored CHECKSUM matches the current header + data; False if it doesn’t; None if the CHECKSUM card is absent.
- Return type:
bool or None
- verify_datasum()
Verify the stored DATASUM against the current data.
- Returns:
True if the stored DATASUM matches the current data section; False if it doesn’t; None if the DATASUM card is absent.
- Return type:
bool or None
- write(data, *, names=None)
Bulk-write data into the table’s data section.
Overwrites all NAXIS2 rows; for appending new rows instead, use append(). Accepts three input forms, all normalizing through the same per-column strip-write kernel:
- Parameters:
data (numpy.ndarray, dict, or list/tuple of ndarrays) – One of:
  - Structured ndarray: field names must match the HDU’s columns (extras, missing, or duplicates rejected); field order may differ from HDU order. len(data) must equal NAXIS2.
  - Dict {name: ndarray}: one entry per HDU column; extras / missing rejected. Each value is a per-column ndarray with shape (NAXIS2,) + per_cell_shape.
  - List or tuple of ndarrays with names=[...]: parallel sequences; same per-column model as dict.
names (list of str, optional) – Required only when data is a list/tuple of ndarrays. Ignored for the structured-ndarray and dict forms.
- Raises:
ValueError – Field name mismatch, missing/extra columns, length mismatch with NAXIS2, or per-cell shape mismatch.
Notes
Validate-then-mutate: any dtype/shape error is raised BEFORE the file is touched, so an invalid input leaves the table unchanged.
See also
append – Add new rows to the table.
__setitem__ – Modify a subset of rows / columns / cells.
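How the dict and list-plus-names forms might funnel into one per-column model is sketched below. `normalize_columns` is a hypothetical helper; plain lists stand in for ndarrays, the structured-ndarray form is omitted because it needs numpy, and case-insensitive name matching on write is an assumption carried over from the lookup rule stated for colnames.

```python
def normalize_columns(data, colnames, nrows, names=None):
    """Return {on-disk name: values} in HDU column order, or raise ValueError."""
    if isinstance(data, dict):
        pairs = list(data.items())
    else:
        if names is None or len(names) != len(data):
            raise ValueError("names= is required and must pair with data one-to-one")
        pairs = list(zip(names, data))

    by_folded = {}
    for name, values in pairs:
        folded = name.upper()
        if folded in by_folded:
            raise ValueError(f"duplicate column {name!r}")
        by_folded[folded] = values

    expected = [name.upper() for name in colnames]
    if set(by_folded) != set(expected):
        raise ValueError("columns must match the table exactly (no extras, none missing)")
    for values in by_folded.values():
        if len(values) != nrows:
            raise ValueError(f"every column must have {nrows} rows")
    # All checks passed; only now would a writer touch the file.
    return {name: by_folded[name.upper()] for name in colnames}

colnames = ("RA", "DEC")
from_dict = normalize_columns({"dec": [3, 4], "ra": [1, 2]}, colnames, nrows=2)
from_list = normalize_columns([[1, 2], [3, 4]], colnames, nrows=2, names=["RA", "DEC"])
print(from_dict == from_list, list(from_dict))  # True ['RA', 'DEC']
```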