NaN handling
Float columns
Float NaN values are handled via IEEE 754 semantics (x != x is true iff NaN).
Note the asymmetry with the _col helpers below: the float helpers operate on
tensors (extract the column with get_float_col first), with the exception of
drop_nan, which takes a Frame plus a column name. The _col helpers always take
a Frame plus a column name.
- is_nan(col): returns a bool tensor (true = missing)
- fill_nan(col, fill_val): replace NaN with fill_val in a float tensor
- drop_nan(df, col_name): remove rows where the column is NaN
- any_nan(col): true if any value is NaN
- count_nan(col): count of NaN values
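The x != x test behind these helpers can be modelled in a few lines. The sketch below is plain Python, not this library's implementation: lists stand in for float tensors, a dict of lists stands in for a Frame, and the function bodies are assumptions that only mirror the behaviour listed above.

```python
import math

# Illustrative stand-ins only: a list models a float tensor and a
# dict of column name -> list models a Frame. Not the library's code.

def is_nan(col):
    # Under IEEE 754, NaN is the only value that is not equal to itself.
    return [x != x for x in col]

def fill_nan(col, fill_val):
    # Replace each NaN with fill_val, leaving other values untouched.
    return [fill_val if x != x else x for x in col]

def any_nan(col):
    return any(x != x for x in col)

def count_nan(col):
    return sum(1 for x in col if x != x)

def drop_nan(df, col_name):
    # Keep only the rows whose value in col_name is not NaN,
    # applying the same row filter to every column.
    keep = [x == x for x in df[col_name]]
    return {name: [v for v, k in zip(vals, keep) if k]
            for name, vals in df.items()}

price = [1.5, math.nan, 3.0]
frame = {"id": [1, 2, 3], "price": price}

nan_mask = is_nan(price)
filled = fill_nan(price, 0.0)
dropped = drop_nan(frame, "price")
```

Here drop_nan is the only helper that needs the whole frame, because removing a row affects every column, while the others only need the extracted column.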
Integer columns
Integer columns carry an explicit boolean missing-value mask (true = missing).
The mask is propagated through joins (sentinel rows), aggregations (masked rows
are skipped), and concat. Use the _col variants:
- is_nan_col(df, col_name): returns the bool mask tensor
- fill_nan_col(df, col_name, fill_val): replace masked entries with fill_val
- drop_nan_col(df, col_name): remove rows where the mask is true
- any_nan_col(df, col_name): true if any entry is masked
- count_nan_col(df, col_name): count of masked entries
To construct an int column directly: int_col_of_list([1, 2, 3]) creates an
IntCol with an all-false (no missing) mask.
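Because integers have no NaN bit pattern, the mask has to travel alongside the values. The following is a minimal Python model of that idea, not the library's implementation: the IntCol dataclass, its field names, the dict-based frame, the return type of fill_nan_col, and the masked_sum helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical model of a masked integer column. The field names and
# the dict-of-columns frame are assumptions, not the library's layout.

@dataclass
class IntCol:
    values: list
    mask: list  # True = missing

def int_col_of_list(values):
    # Build a column whose mask is all False (nothing missing).
    return IntCol(list(values), [False] * len(values))

def is_nan_col(df, col_name):
    return list(df[col_name].mask)

def any_nan_col(df, col_name):
    return any(df[col_name].mask)

def count_nan_col(df, col_name):
    return sum(1 for m in df[col_name].mask if m)

def fill_nan_col(df, col_name, fill_val):
    # Assumed to return a new frame; masked entries take fill_val
    # and the resulting mask is cleared.
    col = df[col_name]
    filled = [fill_val if m else v for v, m in zip(col.values, col.mask)]
    out = dict(df)
    out[col_name] = int_col_of_list(filled)
    return out

def masked_sum(col):
    # Hypothetical aggregation: masked rows are skipped.
    return sum(v for v, m in zip(col.values, col.mask) if not m)

complete = int_col_of_list([1, 2, 3])
df = {"qty": IntCol([10, 0, 30], [False, True, False])}
filled_df = fill_nan_col(df, "qty", -1)
total = masked_sum(df["qty"])
```

The placeholder 0 stored under the masked slot is never read: both the fill and the sum consult the mask first, which is the same reason masked rows can be skipped in aggregations.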