Every validation rule Alphacast runs against an uploaded CSV, and how the deleteMissingFromDB / onConflictUpdateDB / acceptNewColumns flags control row-level conflict resolution.
Once the manifest is in place, every upload runs through a fixed sequence of validations, then applies the merge logic governed by three flags. This page documents both — what causes an upload to fail, and what happens to each row when it succeeds.
If the manifest declares a date column, the CSV must contain it. Every value must parse against the dataset’s configured date format.
Failure | Error message
--- | ---
Missing date column | Data has not Date column
Unparseable date | Date should be formatted as <format>
Alphacast tries the configured format, then a lowercased variant, then splits on whitespace and retries — so 2024-01-01 00:00:00 parses fine against %Y-%m-%d.
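The fallback order described above can be sketched as follows. This is an illustration, not Alphacast's actual code: the function name `parse_date` is hypothetical, "lowercased variant" is assumed to mean the format string passed through `str.lower()`, and the whitespace retry is assumed to use the first token only.

```python
from datetime import datetime


def parse_date(value: str, fmt: str) -> datetime:
    """Hypothetical helper mirroring the documented fallback order."""
    # First the full value, then (assumption) the first whitespace-separated token.
    candidates = [value]
    parts = value.split()
    if len(parts) > 1:
        candidates.append(parts[0])

    for candidate in candidates:
        # Configured format first, then its lowercased variant (assumption).
        for candidate_format in (fmt, fmt.lower()):
            try:
                return datetime.strptime(candidate, candidate_format)
            except ValueError:
                continue

    # Message text taken from the table above.
    raise ValueError(f"Date should be formatted as {fmt}")
```

With this sketch, `parse_date("2024-01-01 00:00:00", "%Y-%m-%d")` fails on the full string, then succeeds on the token `2024-01-01`.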
After Alphacast strips whitespace from entity values, every entity column must have a value on every row. Empty cells are first replaced by the literal string 'none', but if a value is genuinely missing — for example, NaN propagated from a prior calculation — the upload fails.
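A rough sketch of that rule, useful for pre-checking a file before upload. The function name and the error text are hypothetical (the source does not give the message for this failure), and treating `None` and whitespace-only cells the same way as NaN and empty cells respectively is an assumption.

```python
import math


def clean_entity_value(value):
    """Hypothetical pre-check for a single entity cell."""
    # Genuinely missing values (NaN, and by assumption None) are rejected.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        raise ValueError("entity value is missing")  # illustrative message only
    text = str(value).strip()
    # Empty cells become the literal string 'none'.
    return text if text else "none"
```

Running this over every entity column before uploading surfaces NaN values that a prior calculation may have introduced.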
The upload fails when all three of the following hold: the CSV contains columns the dataset has never seen before, the dataset already has a manifest from a prior upload, and you did not pass acceptNewColumns=true.
Failure | Error message
--- | ---
Unknown column without opt-in | Unknown column(s): X. Missing column(s): Y - consider renaming to original names or add 'AcceptNewColumn' parameter to confirm adding new column(s)
To add columns intentionally, pass acceptNewColumns=true. To rename a column instead, use the manifest’s destinationName field — see Manifest.
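The column check can be approximated locally before uploading. In this sketch the function name and arguments are hypothetical, an empty `known_columns` list stands in for "no manifest from a prior upload", and the way X and Y are joined into the message is an assumption.

```python
def check_columns(csv_columns, known_columns, accept_new_columns=False):
    """Hypothetical check: return the new columns, or raise if they were not opted in."""
    unknown = [c for c in csv_columns if c not in known_columns]
    missing = [c for c in known_columns if c not in csv_columns]

    # Only fails when a prior manifest exists and the caller did not opt in.
    if unknown and known_columns and not accept_new_columns:
        raise ValueError(
            f"Unknown column(s): {', '.join(unknown)}. "
            f"Missing column(s): {', '.join(missing)} - consider renaming to "
            "original names or add 'AcceptNewColumn' parameter to confirm "
            "adding new column(s)"
        )
    return unknown
```

A column renamed from `GDP` to `gdp` shows up as both unknown and missing, which is the usual sign that a rename, not a new column, is what you want.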
Two rows with the same combination of entity-column values (including the date column) are treated as duplicates. The upload fails and reports the count plus the first ten offending key combinations.
Failure | Error message
--- | ---
Duplicate rows | <N> rows are duplicated. First 10 rows as example: [...]
If your data legitimately has multiple measurements per Date + Country, you need an additional entity column (e.g. Sector or Source) to disambiguate them.
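To find duplicates before uploading, a check along these lines may help. The function name is hypothetical, and counting every row that shares a key (not only the extra copies) is an assumption about how <N> is computed.

```python
from collections import Counter


def find_duplicates(rows, key_columns):
    """Return (number of rows sharing a key, first ten duplicated keys)."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    duplicated_keys = [key for key, n in counts.items() if n > 1]
    duplicated_rows = sum(counts[key] for key in duplicated_keys)
    return duplicated_rows, duplicated_keys[:10]


rows = [
    {"Date": "2024-01-01", "Country": "AR", "Sector": "Energy", "Value": 1.0},
    {"Date": "2024-01-01", "Country": "AR", "Sector": "Mining", "Value": 2.0},
]
# Keyed on Date + Country the two rows collide; adding Sector separates them.
by_country = find_duplicates(rows, ["Date", "Country"])
by_sector = find_duplicates(rows, ["Date", "Country", "Sector"])
```

Here `by_country` reports two duplicated rows under one key, while `by_sector` reports none.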
Columns marked Short Integer in the manifest must hold values in the [-32768, 32767] range. Decimals, NaNs after coercion, or out-of-range integers all fail.
Failure | Error message
--- | ---
Out-of-range Short Integer | Column '<column>' cannot be cast to Short Integer (Int16): <details>
For values larger than 32,767, use Decimal instead.
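The Int16 constraint can be checked ahead of time with a sketch like this. The function name is hypothetical, the `<details>` part of the message is filled with the offending value purely for illustration, and accepting whole-number floats such as 3.0 is an assumption.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def check_short_integer(column, values):
    """Hypothetical pre-check: return the values as ints or raise on the first bad one."""
    result = []
    for value in values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            number = math.nan  # non-numeric input is treated like NaN after coercion
        if (
            not math.isfinite(number)
            or number != int(number)  # rejects decimals such as 1.5
            or not INT16_MIN <= number <= INT16_MAX
        ):
            raise ValueError(
                f"Column '{column}' cannot be cast to Short Integer (Int16): {value!r}"
            )
        result.append(int(number))
    return result
```

A year column passes, while a population column holding 45000 would raise and should be declared Decimal instead.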
Columns declared Decimal go through automatic numeric inference, which handles values such as "1,200.50", "1.200,50", and "12.5%". If fewer than 80% of non-null values parse as numeric, Alphacast silently downgrades the column to String. No error is raised in this case, but the column won’t behave as a measurement, so inspect the column types after the first upload to catch it.
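The 80% rule can be approximated locally to predict whether a column will survive as Decimal. Everything in this sketch is an assumption except the threshold itself: the function names are hypothetical, a percent sign is simply stripped (12.5% becomes 12.5), the separator that appears last is taken as the decimal mark, and a lone comma is treated as a decimal mark.

```python
import math


def parse_number(text):
    """Best-effort numeric parse; returns None when the text is not numeric."""
    s = text.strip().rstrip("%").strip()
    if "," in s and "." in s:
        if s.rfind(",") > s.rfind("."):
            s = s.replace(".", "").replace(",", ".")  # 1.200,50 -> 1200.50
        else:
            s = s.replace(",", "")                    # 1,200.50 -> 1200.50
    elif "," in s:
        s = s.replace(",", ".")                       # assumption: lone comma is decimal
    try:
        number = float(s)
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def infer_column_type(values, threshold=0.8):
    """Return 'Decimal' if at least `threshold` of non-null values parse, else 'String'."""
    non_null = [v for v in values if v is not None and v.strip() != ""]
    if not non_null:
        return "Decimal"  # assumption: nothing to contradict the declared type
    parsed = sum(parse_number(v) is not None for v in non_null)
    return "Decimal" if parsed / len(non_null) >= threshold else "String"
```

A column with four numeric values and one "n/a" sits exactly at the threshold and stays Decimal under this sketch; one more unparseable value tips it to String.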
deleteMissingFromDB controls what happens to rows that exist in the dataset but are absent from the upload.
Value | Behavior
--- | ---
false | Rows present in the dataset but missing from the upload are kept untouched. Use this for incremental uploads that only add or update recent data.
true | Rows present in the dataset but missing from the upload are deleted. Use this when the upload is the new authoritative state of the dataset (a full replace).
deleteMissingFromDB=true permanently deletes any existing row whose entity key does not appear in the upload. Verify your CSV contains every row you want to keep before setting this flag.
deleteMissingFromDB is forced to false when the manifest is locked.
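The effect of the flag on existing rows can be modeled with two dictionaries keyed by entity key. This is a simplified sketch: the function and argument names are hypothetical, and it only shows which existing rows survive, not how upload values are merged in.

```python
def apply_delete_missing(existing, upload, delete_missing_from_db=False,
                         manifest_locked=False):
    """Return the existing rows that survive the upload."""
    # A locked manifest forces the flag to false.
    if manifest_locked:
        delete_missing_from_db = False
    if not delete_missing_from_db:
        return dict(existing)
    # Full replace: drop every existing row whose key is absent from the upload.
    return {key: value for key, value in existing.items() if key in upload}


existing = {("2024-01-01", "AR"): 1.0, ("2024-02-01", "AR"): 2.0}
upload = {("2024-02-01", "AR"): 2.0}

kept = apply_delete_missing(existing, upload)
replaced = apply_delete_missing(existing, upload, delete_missing_from_db=True)
locked = apply_delete_missing(existing, upload, delete_missing_from_db=True,
                              manifest_locked=True)
```

In this example `kept` and `locked` still contain both rows, while `replaced` loses the January row because the upload omitted it.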
onConflictUpdateDB decides how conflicts are resolved. A conflict is a row whose entity key matches an existing row but where at least one value column differs.
Value | Behavior
--- | ---
false (default) | Conflicts are resolved by keeping the existing value in the dataset. The upload’s value is discarded for those rows.
true | Conflicts are resolved by overwriting the existing value with the upload’s value. Use this when the upload is meant to be the latest correct version of the data.
If more than 10% of values changed during a conflicting upload, the process appends the following warning to statusDescription: WARNING: More than 10% of values changed. This may indicate a problem with the data.
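Conflict resolution and the 10% warning can be sketched together. The names here are hypothetical, each row carries a single value for simplicity, and using the number of existing rows as the denominator for the 10% check is an assumption, since the source does not say what the percentage is measured against.

```python
WARNING = ("WARNING: More than 10% of values changed. "
           "This may indicate a problem with the data.")


def resolve_conflicts(existing, upload, on_conflict_update_db=False):
    """Return (merged rows, warning or None) for dicts keyed by entity key."""
    merged = dict(existing)
    changed = 0
    for key, new_value in upload.items():
        if key not in existing:
            merged[key] = new_value          # new row, not a conflict
        elif existing[key] != new_value and on_conflict_update_db:
            merged[key] = new_value          # conflict: the upload wins
            changed += 1
        # Otherwise the existing value is kept and the upload's value is discarded.
    warning = None
    if existing and changed / len(existing) > 0.10:  # assumed denominator
        warning = WARNING
    return merged, warning


existing = {f"row{i}": i for i in range(10)}
upload = {"row0": 100, "row1": 101, "row10": 10}

kept, kept_warning = resolve_conflicts(existing, upload)
updated, updated_warning = resolve_conflicts(existing, upload,
                                             on_conflict_update_db=True)
```

With the flag off, `row0` and `row1` keep their old values and no warning is produced; with it on, both are overwritten and two changes out of ten rows trigger the warning.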